Glossary of Australian Telecom Terms for AI Voice
AI voice technology sits at the intersection of telecommunications and artificial intelligence. If you're deploying AI call centre solutions in Australia, you'll encounter terminology from both worlds, plus a layer of Australian-specific regulation.
This glossary defines the terms you'll actually need. No fluff. Organised alphabetically, with context for how each term applies to AI voice calling.
A
ACMA (Australian Communications and Media Authority)
Australia's regulator for telecommunications, broadcasting, and the internet. ACMA enforces telemarketing rules, manages the Do Not Call Register, and can issue fines for non-compliant calling practices — including AI calls.
If you're running an AI call centre in Australia, ACMA is the regulator you need to stay on the right side of.
Why it matters for AI: ACMA has made clear that AI callers must comply with the same rules as human callers. There are no special exemptions for automated systems.
Related reading: The Complete Guide to AI Voice Compliance in Australia
ASR (Automatic Speech Recognition)
The technology that converts spoken words into text. Also called STT (Speech-to-Text). ASR is the first step in any AI voice pipeline — the AI needs to understand what the caller said before it can respond.
Australian context: ASR accuracy drops significantly when models are trained on American English and deployed with Australian speakers. Australian accents, place names, and slang create recognition errors that compound throughout the conversation.
ATA (Average Time to Answer)
The average time a caller waits before their call is answered. For AI call centre solutions, ATA should be near-zero since AI agents can answer instantly. Compare this to human call centres where ATA commonly exceeds 60 seconds.
B
Barge-In
When a caller starts speaking before the AI has finished its current sentence. Handling barge-in gracefully is one of the hardest problems in AI voice. The system needs to detect that the human is speaking, stop its own output, process what was said, and respond naturally.
High latency makes barge-in worse. If there's a 500ms delay, the AI keeps talking for half a second after the human starts, creating an awkward overlap.
Related reading: Handling Barge-In: Why High Latency Kills AI Voice in Australia
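The detect, stop, and hand-over sequence described under Barge-In can be sketched as a tiny state machine. This is a minimal illustration, not how any particular platform implements it: the class name, the per-frame `is_speech` flag (assumed to come from a VAD), and the `overlap_ms` helper are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    AI_SPEAKING = auto()
    LISTENING = auto()


class BargeInHandler:
    """Toy barge-in handler (hypothetical): yields the floor when the caller speaks."""

    def __init__(self) -> None:
        self.state = State.LISTENING
        self.barge_ins = 0

    def start_ai_speech(self) -> None:
        self.state = State.AI_SPEAKING

    def on_caller_frame(self, is_speech: bool) -> bool:
        """Return True if this audio frame triggered a barge-in."""
        if is_speech and self.state is State.AI_SPEAKING:
            # Stop playback and switch to listening so the caller's
            # words can be transcribed and answered.
            self.state = State.LISTENING
            self.barge_ins += 1
            return True
        return False


def overlap_ms(caller_start_ms: int, delay_ms: int, ai_end_ms: int) -> int:
    """Milliseconds the AI keeps talking after the caller starts speaking."""
    stop_at = min(caller_start_ms + delay_ms, ai_end_ms)
    return max(0, stop_at - caller_start_ms)


print(overlap_ms(caller_start_ms=1000, delay_ms=500, ai_end_ms=5000))
```

With a 500ms delay the helper reports 500ms of overlap, which is the half-second of talking over the caller described above.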
BPO (Business Process Outsourcing)
The practice of contracting business functions to third-party providers. In the Australian context, BPO often refers to offshore call centre operations in the Philippines, India, or Fiji. AI voice is increasingly positioned as an alternative to both onshore and offshore BPO.
C
CLI / CLID (Calling Line Identification)
The phone number displayed to the person receiving a call. Also known as "Caller ID." In Australia, ACMA requires that telemarketing calls display a valid, callable Australian number. Suppressing or spoofing CLI is prohibited.
For AI calling: Your system must present a legitimate Australian phone number that the recipient can call back. Using overseas numbers or rotating through disposable numbers will attract ACMA scrutiny.
CPS (Calls Per Second)
The rate at which an outbound dialler initiates calls. Enterprise AI call centre platforms need to manage CPS carefully: too high and you'll overwhelm your telephony provider; too low and campaigns take too long.
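To make the CPS trade-off concrete, here is a rough sketch of how a dialler might space out call attempts. The function names and the example figures are illustrative assumptions; real diallers also account for answer rates and concurrent-call limits, which this ignores.

```python
def dial_schedule(num_calls: int, cps: float) -> list[float]:
    """Start offsets (in seconds) for each call at a fixed calls-per-second rate."""
    if cps <= 0:
        raise ValueError("cps must be positive")
    interval = 1.0 / cps
    return [i * interval for i in range(num_calls)]


def launch_time_s(num_calls: int, cps: float) -> float:
    """Approximate time needed to initiate every call in a campaign."""
    if cps <= 0:
        raise ValueError("cps must be positive")
    return num_calls / cps


print(dial_schedule(4, 2.0))      # four calls at 2 CPS
print(launch_time_s(10_000, 5))   # seconds to launch 10,000 calls at 5 CPS
```

Halving the CPS doubles the launch time, which is why the setting is usually tuned against what the telephony provider allows.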
CRM (Customer Relationship Management)
Software that stores customer data and interaction history. AI voice systems integrate with CRMs (Salesforce, HubSpot, Pipedrive, Zoho) to personalise calls and log outcomes. The quality of your CRM data directly affects AI call quality.
D
DNC / DNCR (Do Not Call Register)
Australia's national register of phone numbers that have opted out of telemarketing calls. Managed by ACMA. Before making outbound AI calls, you must scrub your dial list against the DNCR. Calling a registered number can result in fines of up to $2.5 million per contravention.
Related reading: The Do Not Call Register: How to Scrub Your AI Dial Lists
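The scrubbing step itself amounts to a filter. The sketch below assumes you already hold a set of registered numbers in the same format as your dial list; how that set is obtained from the register is outside this example, and `registered_numbers` is a hypothetical stand-in with made-up values.

```python
def scrub_dial_list(dial_list: list[str], registered_numbers: set[str]) -> list[str]:
    """Return the dial list with every registered number removed, order preserved."""
    registered = set(registered_numbers)
    return [number for number in dial_list if number not in registered]


# Made-up numbers for illustration only.
dial_list = ["+61400000001", "+61400000002", "+61400000003"]
registered_numbers = {"+61400000002"}

print(scrub_dial_list(dial_list, registered_numbers))
```

The comparison only works if both lists use one consistent number format, so normalise before scrubbing.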
DTMF (Dual-Tone Multi-Frequency)
The tones generated when you press keys on a phone keypad. Traditional IVR systems use DTMF for navigation ("Press 1 for sales"). AI voice systems are replacing DTMF-based menus with natural conversation, though many still fall back to DTMF for specific actions like entering account numbers.
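Each DTMF key is the sum of one low-frequency and one high-frequency tone, arranged in a 4x4 grid. A small lookup shows the idea; the function name is just for illustration.

```python
LOW_HZ = (697, 770, 852, 941)        # one frequency per keypad row
HIGH_HZ = (1209, 1336, 1477, 1633)   # one frequency per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple[int, int]:
    """Return the (low, high) tone pair in Hz for a single keypad key."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index != -1:
            return LOW_HZ[row_index], HIGH_HZ[col_index]
    raise ValueError(f"not a DTMF key: {key!r}")


print(dtmf_frequencies("5"))
```

The A to D column exists in the standard but does not appear on ordinary phone keypads.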
E
E.164
The international standard for phone number formatting. Australian numbers in E.164 format start with +61 (e.g., +61400000000). Your AI call centre platform should handle E.164 formatting for all outbound calls and CRM integrations.
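As a rough illustration of E.164 normalisation, the sketch below converts common Australian number formats to +61 form. It is a simplified assumption-laden example: it only handles standard nine-digit geographic and mobile numbers (not 13, 1300, or 1800 numbers), and `to_e164_au` is a hypothetical helper, not part of any library.

```python
import re


def to_e164_au(raw: str) -> str:
    """Normalise a standard Australian geographic or mobile number to E.164."""
    cleaned = re.sub(r"[^\d+]", "", raw)  # drop spaces, brackets, hyphens
    if cleaned.startswith("+61"):
        national = cleaned[3:]
    elif cleaned.startswith("0061"):
        national = cleaned[4:]
    elif cleaned.startswith("0"):
        return _validate(cleaned[1:])      # strip the domestic trunk prefix
    else:
        raise ValueError(f"unrecognised number format: {raw!r}")
    # Tolerate the "+61 (0)4..." style by dropping one redundant zero.
    if national.startswith("0"):
        national = national[1:]
    return _validate(national)


def _validate(national: str) -> str:
    # Assumption: area codes 2, 3, 7, 8 or mobile prefix 4, then 8 more digits.
    if not re.fullmatch(r"[23478]\d{8}", national):
        raise ValueError(f"not a standard Australian number: {national!r}")
    return "+61" + national


print(to_e164_au("0400 000 000"))
print(to_e164_au("(02) 9000 0000"))
```

Storing numbers in this one format keeps CRM records and dial lists directly comparable.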
Endpointing
The process of detecting when a speaker has finished talking. Good endpointing is critical for natural AI conversation — too aggressive and the AI cuts people off; too conservative and there are awkward silences. Australian speech patterns (including the rising inflection) can confuse endpointing models trained on American English.
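The aggressive-versus-conservative trade-off can be seen in a simple silence-timeout endpointer. This is a naive sketch: it assumes 20ms audio frames that have already been labelled speech or non-speech, and the threshold values are arbitrary. Production systems often use trained models rather than a fixed timeout.

```python
from typing import Optional, Sequence


def find_endpoint(
    frames: Sequence[bool], frame_ms: int = 20, silence_ms: int = 600
) -> Optional[int]:
    """Index of the frame where end-of-turn is declared, or None if never."""
    frames_needed = silence_ms // frame_ms
    heard_speech = False
    silent_run = 0
    for index, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= frames_needed:
                return index
    return None


# Speech, a short mid-sentence pause (100ms), more speech, then real silence.
frames = [True] * 10 + [False] * 5 + [True] * 5 + [False] * 40

print(find_endpoint(frames, silence_ms=600))  # waits for the long silence
print(find_endpoint(frames, silence_ms=100))  # fires during the short pause
```

The 100ms setting declares the turn over during the mid-sentence pause and cuts the speaker off, while the 600ms setting waits but adds that much dead air before the AI can reply.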
F
Full Duplex
A communication mode where both parties can speak and listen simultaneously — like a normal phone conversation. AI voice systems must support full duplex to feel natural. Half-duplex systems (where only one party can speak at a time) feel robotic and frustrating.
G
Gateway (SIP Gateway)
Hardware or software that connects different telephone networks. SIP gateways bridge VoIP (internet-based calling) and PSTN (traditional phone networks). Your AI calling platform uses gateways to connect to Australian phone numbers.
H
Hallucination
When an AI generates information that sounds plausible but is factually incorrect. In AI voice, hallucinations are especially dangerous — the AI might confidently state wrong business hours, incorrect pricing, or inaccurate policy details during a live phone call.
Hold Music / Hold Prompt
Audio played while a caller waits. AI voice systems can eliminate hold times entirely for routine queries. For transfers to humans, AI can provide status updates instead of generic hold music.
I
IVR (Interactive Voice Response)
The traditional phone menu system — "Press 1 for sales, press 2 for support." IVR uses pre-recorded prompts and DTMF input. AI voice agents are rapidly replacing IVR with natural conversation that understands what callers actually say, rather than forcing them through rigid menu trees.
J
Jitter
Variation in the delay between data packets arriving. High jitter in voice calls causes choppy audio, missing words, and robotic-sounding speech. Australian connections to US-based AI platforms are particularly susceptible to jitter due to the long network path.
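One common way to put a number on jitter is the running interarrival estimate defined in RFC 3550 for RTP, which smooths the change in packet transit time with a gain of 1/16. The sketch below applies that formula to lists of send and receive timestamps; the function name and sample timestamps are illustrative.

```python
def interarrival_jitter(send_ms: list[float], recv_ms: list[float]) -> float:
    """Running jitter estimate in ms, using the RFC 3550 smoothing formula."""
    if len(send_ms) != len(recv_ms):
        raise ValueError("send and receive lists must be the same length")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # Difference in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


send = [0, 20, 40, 60]              # packets sent every 20ms
steady = [100, 120, 140, 160]       # constant 100ms transit time
uneven = [100, 120, 156, 160]       # third packet arrives 16ms late

print(interarrival_jitter(send, steady))
print(interarrival_jitter(send, uneven))
```

A constant delay, however long, produces zero jitter; only variation in the delay counts.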
L
Latency
The delay between when something is said and when a response begins. In AI voice, total latency includes network transit, speech recognition, language model processing, and speech synthesis. Acceptable conversational latency is under 800ms. Australian calls to US-hosted AI platforms typically exceed 1 second.
Related reading: What Is Latency and Why It Impacts AI Call Quality
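Because total latency is the sum of its stages, a simple budget check makes it easy to see which stage pushes a call past the 800ms mark. The per-stage figures below are invented for illustration and are not measurements of any platform.

```python
BUDGET_MS = 800  # conversational threshold quoted in the Latency entry


def total_latency_ms(stages: dict[str, float]) -> float:
    """Sum the per-stage delays for one conversational turn."""
    return sum(stages.values())


def within_budget(stages: dict[str, float], budget_ms: float = BUDGET_MS) -> bool:
    return total_latency_ms(stages) < budget_ms


# Hypothetical figures, in milliseconds.
stages = {"network": 150, "asr": 200, "llm": 350, "tts": 150}

print(total_latency_ms(stages), within_budget(stages))
```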
LLM (Large Language Model)
The AI that powers the "brain" of a voice agent. LLMs (GPT, Claude, Gemini, etc.) process the transcribed speech and generate an appropriate response. The quality of the LLM directly affects how natural and accurate the AI's conversation is.
M
MOS (Mean Opinion Score)
A standardised measure of voice quality, rated on a scale from 1 to 5. A MOS of 4.0+ indicates toll-quality (indistinguishable from a normal phone call). AI voice systems should target MOS above 3.5 at minimum. Latency and jitter both reduce MOS.
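MOS is often estimated rather than collected from listeners. One widely used approach is the ITU-T G.107 E-model, which computes a transmission rating R (reduced by impairments such as delay and packet loss) and maps it to an estimated MOS. The sketch below covers only that final mapping; computing R itself is beyond this example, and the function name is illustrative.

```python
def r_factor_to_mos(r: float) -> float:
    """Map an E-model R-factor (0-100) to an estimated MOS (1.0-4.5)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
    # The polynomial dips fractionally below 1 for very small R, so clamp it.
    return max(1.0, mos)


for r in (50, 70, 80, 93.2):
    print(r, round(r_factor_to_mos(r), 2))
```

Note that the estimated scale tops out at 4.5, not 5, so a score above 4.0 is already near the ceiling.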
N
NBN (National Broadband Network)
Australia's national broadband infrastructure. NBN replaced copper telephone lines with a mix of fibre, fixed wireless, and satellite connections. For AI voice, NBN provides the internet connectivity that VoIP calls travel over. NBN quality varies significantly by technology type and location.
NLU (Natural Language Understanding)
The AI component that interprets the meaning of what someone said, not just the words. NLU determines intent ("I want to book an appointment" vs. "What are your hours?") and extracts entities (dates, names, numbers). Strong NLU is what separates useful AI agents from frustrating ones.
O
On-Net / Off-Net
On-net calls stay within the same carrier's network. Off-net calls travel between different carriers. On-net calls typically have lower latency and cost. For AI calling platforms, choosing an Australian SIP provider with strong on-net relationships reduces call quality issues.
P
PBX (Private Branch Exchange)
A private telephone system within a business. Modern cloud PBX systems can integrate with AI voice platforms, routing calls between AI agents and human staff based on rules, time of day, or caller intent.
PSTN (Public Switched Telephone Network)
The traditional "landline" phone network. While VoIP is dominant, AI calling platforms still connect to PSTN for reliability and reach. All Australian phone numbers (landline and mobile) are reachable via PSTN.
R
RTP (Real-time Transport Protocol)
The protocol used to transmit voice data in real-time over the internet. RTP handles the actual audio stream in VoIP calls. AI voice platforms use RTP or WebRTC for voice data transport.
S
SIP (Session Initiation Protocol)
The standard protocol for setting up, managing, and terminating VoIP calls. SIP handles the "handshake" — connecting the call, negotiating audio quality, and managing the session. Almost every AI calling platform uses SIP to connect to phone networks.
SIP Trunking
A virtual phone line that connects your system to the telephone network via SIP. Instead of physical phone lines, SIP trunks deliver calls over the internet. AI call centres use SIP trunks to make and receive calls at scale. Australian SIP trunk providers include Twilio, Vonage, and local operators like Symbio and Telstra Programmable.
STT (Speech-to-Text)
See ASR. The process of converting audio speech into written text. STT accuracy is measured by Word Error Rate (WER). For Australian AI voice, STT models need to handle Australian accents, place names (Woolloomooloo, Toowoomba, Joondalup), and colloquialisms.
T
TCP Code (Telecommunications Consumer Protections Code)
An ACMA-registered industry code that sets rules for how telcos deal with customers. Relevant sections cover telemarketing, credit management, and complaint handling. AI calling platforms that operate as carriers or carriage service providers must comply.
TIO (Telecommunications Industry Ombudsman)
Australia's independent dispute resolution body for telecommunications complaints. If a customer has a complaint about AI calls that can't be resolved directly, TIO may get involved. TIO complaints are taken seriously by ACMA and can trigger broader regulatory review.
TTS (Text-to-Speech)
The technology that converts written text into spoken audio. TTS is the final step in the AI voice pipeline — after the LLM generates a response, TTS speaks it aloud. Modern neural TTS produces remarkably natural speech, especially when trained on Australian accents.
Australian context: TTS quality varies dramatically by accent. Australian-trained TTS sounds natural. US-trained TTS attempting an Australian accent often falls into the "uncanny valley."
Related reading: Why Australian Accents Boost AI Call Conversions
Turn-Taking
The natural rhythm of conversation where speakers alternate. Humans manage turn-taking instinctively with pauses of 200-300ms. AI voice systems must replicate this timing — too fast feels aggressive, too slow feels unresponsive.
V
VAD (Voice Activity Detection)
Technology that determines when someone is speaking versus when there's silence or background noise. VAD is critical for AI voice — it tells the system when the human has started and stopped talking, enabling the AI to respond at the right moment.
Related reading: What Is VAD (Voice Activity Detection)?
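The simplest form of VAD compares the energy of each audio frame against a threshold. The sketch below does exactly that on 16-bit PCM sample values; it is deliberately naive, the threshold is an arbitrary assumption, and it would misfire on loud background noise, which is why production systems tend to use trained models instead.

```python
import math
from typing import Sequence


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def is_speech(frame: Sequence[int], threshold: float = 500.0) -> bool:
    """Naive energy-based VAD: a frame counts as speech if it is loud enough."""
    return rms(frame) > threshold


silence = [0] * 160                 # 20ms of digital silence at 8kHz
loud = [3000, -3000] * 80           # 20ms of a strong signal

print(is_speech(silence), is_speech(loud))
```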
VoIP (Voice over Internet Protocol)
Transmitting voice calls over the internet rather than traditional phone lines. All AI calling platforms use VoIP for the AI processing side. Calls are then bridged to PSTN for delivery to regular phone numbers.
W
WebRTC (Web Real-Time Communication)
A browser-based standard for real-time audio and video communication. Some AI voice platforms use WebRTC for the agent-side audio stream. WebRTC handles echo cancellation, noise suppression, and codec negotiation automatically.
WebSocket
A communication protocol that provides full-duplex communication over a single connection. AI voice platforms use WebSockets for real-time audio streaming between the telephony system and the AI processing pipeline. Because the connection stays open, WebSockets avoid the per-request overhead of repeated HTTP calls, which makes them better suited to continuous data streams.
WER (Word Error Rate)
The standard metric for speech recognition accuracy. It counts the word substitutions, deletions, and insertions in a transcript and divides by the number of words actually spoken. A WER of 5% means roughly 5 errors for every 100 words spoken. Australian-accented speech typically has higher WER on US-trained models, sometimes 2-3x higher than American English.
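WER is normally computed as a word-level edit distance between the reference transcript and the recogniser's output, divided by the reference length. A minimal sketch follows; the function name and the sample sentences are illustrative, and real evaluations usually normalise punctuation and numerals first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """(substitutions + deletions + insertions) / words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


reference = "book a table in toowoomba for friday"
hypothesis = "book a table in too woomba for friday"

print(f"{word_error_rate(reference, hypothesis):.1%}")
```

Here a single misheard place name counts as two errors (one substitution and one insertion) out of seven reference words, which shows how quickly local vocabulary can inflate WER.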
Using This Glossary
This glossary covers the terminology you'll encounter when evaluating, deploying, and managing AI call centre solutions in Australia. Bookmark it. Share it with your team. Come back to it when a vendor throws jargon at you.
The intersection of telecom and AI creates a lot of acronyms. Understanding what they mean — and how they apply to Australian business — puts you in a much stronger position to make good decisions.
Ready to put this knowledge into practice? Start your free trial at voxworks.ai.

