Learning VoIP, RTP and SIP (aka awesome pjsip)
Before working with Windows Phone and iOS, my life involved researching VoIP. That was to build a C library for voice over IP functionality for a very popular app, and that was how I got started in open source.
The libraries I worked with were Linphone and pjsip. I learned a lot about UDP and the SIP protocol, how to build a C library for consumption on iOS, Android and Windows Phone, how challenging it is to support C++ components and a thread pool on Windows Phone 8, how to tweak the entropy functionality in OpenSSL to make it compile on Windows Phone 8, and how hard it was to debug C code with the Android NDK. It was a time when I needed to keep Visual Studio, Xcode and Eclipse open at the same time, joined mailing lists and followed gmane. Lots of good memories.
Today I found that the bookmarks I made back then are still in Safari, so I thought I should share them here. I had to remove many articles because they are outdated or no longer available. These are the resources that I actually read and used, not some random links. Hopefully you can find something useful.
This post focuses on resources for pjsip on the client side, and on how clients talk to each other directly or through a proxy server.
Here are some of the articles and open source projects I made regarding VoIP. I hope you find them useful.
rtpproxy: I forked it from http://www.rtpproxy.org/ and changed the code to support IP handover. This means the proxy can cope when the client's IP changes, for example when moving from 3G or 4G to Wi-Fi, while reducing the chance of attacks.
Voice over Internet Protocol (also voice over IP, VoIP or IP telephony) is a methodology and group of technologies for the delivery of voice communications and multimedia sessions over Internet Protocol (IP) networks, such as the Internet
Voice over IP Overview: introduction to VoIP concepts, H.323 and SIP protocol
Voice over Internet Protocol: the Wikipedia article covers the foundational knowledge
Open Source VOIP Software: this is a must read. Lots of foundational articles about client and server functionality, SIP, TURN, RTP, and many open source frameworks
VOIP call bandwidth: bandwidth consumption is a key factor in any VoIP application, so it’s good not to go far beyond the accepted limit
Routers SIP ALG: this is the most annoying part, because there is NAT, there are many types of NAT, and there are also routers with SIP ALG
SIP SIMPLE Client SDK: introduction to SIP core library, but it gives an overview of how
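To make the bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes IPv4 without options, plain RTP over UDP, and 20 ms of audio per packet. The function name `call_bandwidth_kbps` is made up for illustration, and link-layer overhead (Ethernet, Wi-Fi) is ignored.

```python
# Per-call bandwidth for ONE direction of an RTP stream over IPv4/UDP.
# Assumptions: no IP options, no RTP extensions, no link-layer overhead.
IP_HEADER = 20    # bytes
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes


def call_bandwidth_kbps(codec_bitrate_kbps, packet_ms=20):
    """Return the IP-level bandwidth in kbit/s for one direction of a call."""
    packets_per_second = 1000 / packet_ms
    # Bytes of encoded audio carried by each packet.
    payload_bytes = codec_bitrate_kbps * 1000 / 8 * packet_ms / 1000
    packet_bytes = payload_bytes + RTP_HEADER + UDP_HEADER + IP_HEADER
    return packet_bytes * 8 * packets_per_second / 1000


g711 = call_bandwidth_kbps(64)  # G.711 is a 64 kbit/s codec
g729 = call_bandwidth_kbps(8)   # G.729 is an 8 kbit/s codec
print(g711, g729)
```

The headers add 40 bytes to every packet, so a low-bitrate codec saves less than its nominal bitrate suggests: most of a G.729 packet is header.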
The Session Initiation Protocol (SIP) is a communications protocol for signaling and controlling multimedia communication sessions in applications of Internet telephony for voice and video calls, in private IP telephone systems, as well as in instant messaging over Internet Protocol (IP) networks.
RFC 3261: to understand SIP, we need to read its standard. I don’t know how many times I read this RFC.
OpenSIPS: OpenSIPS is a multi-functional, multi-purpose signaling SIP server
SIP protocol structure through an example: this is a must read, it shows very basic but necessary knowledge
Relation among Call, Dialog, Transaction & Message: basic concepts about call, dialog, transaction and message
microSIP: Open source portable SIP softphone for Windows based on PJSIP stack. I used this to test my tweaked pjsip library before building it for mobile
What is SIP: introduction to SIP written by the author of CSipSimple
SIP by Wireshark: introduction to SIP written by Wireshark. I used Wireshark a lot to intercept and debug SIP sessions
Solving the Firewall/NAT Traversal Issue of SIP: this shows how NAT can be a problem to SIP applications and how NAT traversal works
SIP Retransmissions: what retransmissions are and how to handle them
draft-ietf-sipping-dialogusage-06: this is a draft about Multiple Dialog Usages in the Session Initiation Protocol
Creating and sending INVITE and CANCEL SIP text messages: SIP also supports sending text messages, not just audio and video packets. This is good for chat applications
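SIP is a text protocol, so a request is a start line, some headers, a blank line and an optional body. The sketch below builds an INVITE and the CANCEL that matches it. The helpers `build_request` and `parse_message` are hypothetical and so are all addresses, tags and IDs. A real stack such as pjsip also handles transactions, retransmissions, repeated headers and SDP bodies, none of which is covered here.

```python
def build_request(method, target_uri, headers, body=""):
    """Serialize a SIP request: start line, headers, blank line, body."""
    lines = [f"{method} {target_uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_message(text):
    """Split a SIP message into start line, header dict and body.

    Simplification: a repeated header (e.g. several Via lines) keeps
    only its last value.
    """
    head, _, body = text.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical values, loosely modeled on the examples in RFC 3261.
invite_headers = [
    ("Via", "SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK776asdhds"),
    ("Max-Forwards", "70"),
    ("From", "<sip:alice@example.com>;tag=1928301774"),
    ("To", "<sip:bob@example.com>"),
    ("Call-ID", "a84b4c76e66710@client.example.com"),
    ("CSeq", "314159 INVITE"),
    ("Contact", "<sip:alice@client.example.com>"),
]
invite = build_request("INVITE", "sip:bob@example.com", invite_headers)

# A CANCEL reuses the Request-URI, Via, From, To, Call-ID and CSeq number
# of the INVITE it cancels; only the method changes.
cancel_headers = [
    (name, "314159 CANCEL" if name == "CSeq" else value)
    for name, value in invite_headers
    if name != "Contact"
]
cancel = build_request("CANCEL", "sip:bob@example.com", cancel_headers)
print(invite)
```

Printing both messages and comparing them against a Wireshark capture is a quick way to see which headers tie a CANCEL to its INVITE.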
Kamailio: this is the server that I used, and it plays well with lots of standard SIP clients, including pjsip. Debugging on this server was also a fun story
Configuring NAT traversal using Kamailio 3.1 and the Rtpproxy server: I don’t know how many times I have read this post
How to set up and use SIP Server on Windows: I used this to test a working SIP server on Windows
OpenSIPS/Kamailio serving far end nat traversal: discussion about how Kamailio deals with NAT traversal
NAT Traversal Module: how NAT traversal works in Kamailio as a module
RTP and SIP clients and servers need to conform to predefined protocols to meet the standards and to be able to talk to each other. You need to read a lot of RFCs, and some drafts as well.
NAT solves the shortage of IP addresses, but it causes lots of problems for SIP applications, and for me as well 😂
Network address translation: Network address translation (NAT) is a method of remapping one IP address space into another by modifying network address information in the IP header of packets while they are in transit across a traffic routing device
Configuring Port Address Translation (PAT): how to configure port forwarding
Types Of NAT Explained (Port Restricted NAT, etc): This is a must read. I didn’t expect there to be so many kinds of NAT in real life, or that each type affects SIP applications in its own way
One Way Audio SIP Fix: sometimes only one party can be heard during a call; this explains why
NAT traversal for the SIP protocol: explains RTP, SIP and NAT
SIP NAT Traversal: This is a must read. How to make SIP work under NAT
NAT and Firewall Traversal with STUN / TURN / ICE: pjsip and Kamailio both support the STUN, TURN and ICE protocols. Learn about these concepts and how to make them work
Learn how TCP helps SIP initiate sessions and how to switch to TCP mode for sending packets
Transmission Control Protocol: The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP)
Datagram socket: A datagram socket is a type of network socket which provides a connectionless point for sending or receiving data packets. Each packet sent or received on a datagram socket is individually addressed and routed
TCP RST packet details: learn the importance of the RST bit
Where do resets come from? (No, the stork does not bring them.): learn about the three-way handshake in a TCP connection
Sockets and Ports: do not confuse sockets with ports
Learn about Transport Layer Security and SSL, especially OpenSSL, to see how to secure a SIP connection. The interesting part is reading the pjsip code to see how it uses OpenSSL to encrypt messages
Configuring TLS support in Kamailio 3.1 — Howto: learn how to enable TLS mode in Kamailio
SIP TLS: how to configure TLS in Asterisk
Learn about Interactive Connectivity Establishment, another way to work around NAT
STUN: STUN (Simple Traversal of UDP through NATs (Network Address Translation)) is a protocol for assisting devices behind a NAT firewall or router with their packet routing. RFC 5389 redefines the term STUN as ‘Session Traversal Utilities for NAT’.
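To give a feel for how small STUN is on the wire, here is a minimal sketch based on the RFC 5389 message layout. It builds a Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute from a Binding Success Response. The function names are my own, and it does no networking, no retransmission, and no checking of the transaction ID or message integrity.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request with no attributes."""
    transaction_id = transaction_id or os.urandom(12)
    # Header: type, attribute length (0), magic cookie, 96-bit transaction ID.
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def parse_xor_mapped_address(message):
    """Return (ip, port) from a Binding Success Response, or None."""
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            # IPv4: the port is XORed with the top 16 bits of the cookie,
            # the address with the whole cookie.
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        # Attribute values are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

In practice you would send the request over a UDP socket to a STUN server and feed the reply into `parse_xor_mapped_address`. The result is the public address and port your NAT assigned to that socket.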
Learn about Application Layer Gateway and how it affects your SIP application. This component knows how to inspect and modify your SIP messages, so it might introduce unexpected behaviours.
Understanding SIP with Network Address Translation (NAT): This is a must read, a very thorough document
Learn about voice quality, bandwidth and fixing delay in audio
RTP, Jitter and audio quality in VoIP: learn about the importance of jitter and RTP
An Adaptive Codec Switching Scheme for SIP-based VoIP: explains codec switching during a call in SIP-based VoIP
This is a very common problem in VoIP: sometimes we hear our own voice coming back along with the other party's. Learn how echo is produced, and how to do echo cancellation effectively
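As a small illustration of what "jitter" means in numbers, the sketch below follows the interarrival jitter estimator from RFC 3550: it smooths the change in transit time between consecutive packets with a gain of 1/16. The function name and input shape are my own choices, and both values in each pair are assumed to be in RTP timestamp units.

```python
def interarrival_jitter(packets):
    """Estimate jitter as in RFC 3550.

    packets: iterable of (rtp_timestamp, arrival_time) pairs, both expressed
    in RTP timestamp units (e.g. 8000 per second for narrowband audio).
    """
    jitter = 0.0
    previous_transit = None
    for rtp_timestamp, arrival in packets:
        transit = arrival - rtp_timestamp
        if previous_transit is not None:
            # How much the transit time changed compared to the last packet.
            d = abs(transit - previous_transit)
            # Running average with gain 1/16, which damps single spikes.
            jitter += (d - jitter) / 16.0
        previous_transit = transit
    return jitter


steady = interarrival_jitter([(0, 1000), (160, 1160), (320, 1320)])
one_late = interarrival_jitter([(0, 1000), (160, 1176)])
print(steady, one_late)
```

A perfectly paced stream gives zero, while a single packet arriving 16 units late only moves the estimate by 1.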
Echo Cancellation: How to use Speex to cancel echo
Echo and Sidetone: A telephone is a duplex device, meaning it is both transmitting and receiving on the same pair of wires. The phone network must ensure that not too much of the caller’s voice is fed back into his or her receiver
How software echo canceller works?: I asked about how we use software to do echo cancellation
Learn how to generate dual-tone (DTMF) signals, which are used for signaling in telecommunication
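A dual tone is just the sum of two sine waves, one from a low-frequency group and one from a high-frequency group. Here is a minimal sketch using the standard DTMF keypad frequencies. The function name, the default amplitude and the 8 kHz sample rate are my own assumptions, and it is not how pjmedia's tone generator is implemented.

```python
import math

# Standard DTMF (low Hz, high Hz) pairs for the 12-key pad.
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}


def dtmf_samples(key, duration_ms=100, sample_rate=8000, amplitude=12000):
    """Return signed 16-bit PCM samples for one key press."""
    low, high = DTMF_FREQS[key]
    count = sample_rate * duration_ms // 1000
    return [
        # Sum of two sines; the peak is 2 * amplitude, which fits in 16 bits.
        int(amplitude * (math.sin(2 * math.pi * low * n / sample_rate)
                         + math.sin(2 * math.pi * high * n / sample_rate)))
        for n in range(count)
    ]


samples = dtmf_samples("5")
print(len(samples), samples[:4])
```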
PJSIP is a free and open source multimedia communication library written in C language implementing standard based protocols such as SIP, SDP, RTP, STUN, TURN, and ICE. It combines signaling protocol (SIP) with rich multimedia framework and NAT traversal functionality into high level API that is portable and suitable for almost any type of systems ranging from desktops, embedded systems, to mobile handsets.
PJSUA API — High Level Softphone API: high level usage of pjsip
Stateful Operations: common functions to send request statefully
Message Creation and Stateless Operations: functions related to sending and receiving messages
Understanding Media Flow: this is a must read. The media layer is so important, it controls sound, codec and conference bridge.
Getting Started: Building and Using PJSIP and PJMEDIA: This article describes how to download, customize, build, and use the open source PJSIP and PJMEDIA SIP and media stack
Codec Framework: pjsip supports multiple codecs
Adaptive jitter buffer: this takes some time to understand, but it plays an important part in making pjsip handle buffering properly
PJSUA-API Accounts Management: how to register account in pjsua
Building Dynamic Link Libraries (DLL/DSO): how to build pjsip as a dynamic library
Compile time configuration: lots of configuration we can apply to pjsip
Fast Memory Pool: pjsip has its own memory pool. It’s very interesting to look at the source code and learn something new
Using SIP TCP Transport: How to enable TCP mode in SIP and to initiate SIP session
Monochannel and multichannel audio frame converter: interesting read about mono and multi channel
IOQueue: I/O Event Dispatching with Proactor Pattern: the code for this is very interesting and plays a fundamental role in how pjsip handles events
DNS Asynchronous/Caching Resolution Engine: how pjsip handles DNS resolution by itself
Secure socket I/O: the code for this is important if you want to learn how to use SSL under the hood
Multi-frequency tone generator: I learned a lot about how pjsip uses sine waves to generate tones
SIP SRV Server Resolution (RFC 3263 — Locating SIP Servers): learn the mechanism for how pjsip finds a particular SIP server
Exception Handling: how to do Try Catch in C
Mutex Locks Order in PJSUA-LIB: how multiple locks at each layer help ensure correctness and avoid deadlocks. I had lots of nightmares debugging deadlocks with pjsip 😱
pjsip uses Thread Local Storage, which introduces very cool behaviors
Threads question: how pjlib handles threads
Using Thread Local Storage: how to use TlsAlloc and TlsFree in Windows
Example: Thread local storage in a Pthread program: how Pthread works
Thread Local Storage: learn about pj_thread
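If thread local storage is new to you, the idea is that the same variable name holds a different value in each thread. The sketch below shows it with Python's `threading.local`. It only illustrates the concept and is not pjlib's implementation, which is written in C on top of the platform's TLS primitives.

```python
import threading

context = threading.local()   # each thread sees its own set of attributes


def worker(name, results):
    context.name = name           # stored for the current thread only
    results[name] = context.name  # reads back this thread's own value


results = {}
threads = [
    threading.Thread(target=worker, args=(f"worker-{i}", results))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set context.name, so it has no such attribute.
main_has_name = hasattr(context, "name")
print(results, main_has_name)
```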
How to work with sample rate of the media stream
Resample Port: how to perform resampling in pjmedia
Resampling Algorithm: code to perform resampling
Samples: Using Resample Port: very straightforward example to change sample rate of the media stream
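To show what changing the sample rate involves, here is a deliberately simple linear-interpolation resampler. It is not the algorithm pjmedia uses. It has no anti-aliasing filter, so downsampling real audio with it will alias, and the function name is made up.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample PCM samples by linear interpolation (no filtering)."""
    if not samples:
        return []
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for i in range(out_len):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


upsampled = resample_linear([0, 10, 20, 30], 8000, 16000)
downsampled = resample_linear([0, 10, 20, 30], 16000, 8000)
print(upsampled, downsampled)
```

Going from 8 kHz to 16 kHz doubles the number of samples by inserting midpoints. Going the other way simply drops every second sample, which is why a proper resampler low-pass filters first.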
How to Record Audio with pjsua: how to use pjsua to record audio.
Memory/Buffer-based Capture Port: believe me, you will jump into pjmedia_mem_capture_create a lot
File Writer (Recorder): record audio to .wav file
AMR Audio Encoding: understands AMR encoding
Audio Device API: how pjsip detects and uses audio devices
Sound Device Port: Media Port Connection Abstraction to the Sound Device
bad quality on iphone 2G with os 3.0: No one would use iPhone 2G now, but it’s good to be aware of older phones
getting Underflow, buf_cnt=0, will generate 1 frame continuessly: how to handle underflow in pjmedia
Measuring Sound Latency: This article describes how to measure both sound device latency and overall (end-to-end) latency of pjsua
Master/sound: how the master sound port works and how to deal with no sound on the mic input port
I learned a lot about video capture, ffmpeg and color spaces, especially YUV
siphon — VIdeoSupport.wiki: How siphon deals with video before pjsip 2.0
Video Device API: PJMEDIA Video Device API is a cross-platform video API appropriate for use with VoIP applications and many other types of video streaming applications.
PJSUA-API Video: using the video APIs in pjsua with pjsip 2.1.0
PJSIP Video User’s Guide: all you need to know about video support in pjsip
Video streams: I can never forget pjmedia_vid_stream_create
Video source duplicator: duplicate video data in the stream.
AVI File Player: Video and audio playback from AVI file
PJSIP Version 2.0 Release Notes: starting with 2.0, pjsip supports video. Good to read
FFmpeg-iOS-build-script: details how to build ffmpeg for iOS
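Since YUV comes up constantly when working with video capture, here is a small sketch of converting one RGB pixel to full-range YCbCr with the BT.601 coefficients (the JPEG/JFIF variant). Real video pipelines often use limited range or BT.709 and work on whole planes with chroma subsampling, so treat this only as an illustration of the math. The function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                # luma (brightness)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-difference chroma

    def clamp(value):
        return max(0, min(255, int(round(value))))

    return clamp(y), clamp(cb), clamp(cr)


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
red = rgb_to_ycbcr(255, 0, 0)
print(white, black, red)
```

Grey pixels such as white and black carry no color information, so both chroma values sit at the midpoint 128. Only the luma changes.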
There are many SIP clients for mobile and desktop: microSIP, Jitsi, Linphone, Doubango, … They all strictly follow the SIP standard and may have their own SIP core; for example, microSIP uses pjsip and Linphone uses liblinphone.
Among them, I learned a lot from the Android client, CSipSimple, which offers a very nice interface and good functionality. Unfortunately Google Code was closed, so I don’t know if the author plans to continue development on GitHub.
You can read What is a branded version
I don’t make any money from csipsimple at all. It’s a pure opensource and free as in speech project.
I develop it on my free time and just so that it benefit users.
That’s the reason why the project is released under GPL license terms. I advise you to read carefully the license (you’ll learn a lot of things on the spirit of the license and the project) : http://www.gnu.org/licenses/gpl.html
To sum up, the spirit of the GPL is that users should be always allowed to see the source code of the software they use, to use it the way they want and to redistribute it.
Because of NAT, or in case users want to talk via a proxy, an RTP proxy is needed. RTPProxy follows the standard and works well with Kamailio
An IP change during a call can cause problems, such as when the user goes from Wi-Fi to 4G
Learn about the Real-time Transport Control Protocol (RTCP) and how it works with RTP
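When debugging media problems it helps to know what is inside an RTP packet. The sketch below decodes the 12-byte fixed RTP header as laid out in RFC 3550. The function name is my own, and CSRC entries, header extensions and padding are left unparsed.

```python
import struct


def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header into a dict."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,             # always 2 for current RTP
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),     # e.g. start of a talkspurt
        "payload_type": second & 0x7F,     # 0 = PCMU, 8 = PCMA
        "sequence": sequence,              # used to detect loss and reordering
        "timestamp": timestamp,            # sampling instant, in clock units
        "ssrc": ssrc,                      # identifies the media source
    }


# A hand-made packet: version 2, marker set, PCMU, 160 bytes of payload.
packet = struct.pack("!BBHII", 0x80, 0x80, 1, 160, 0xDEADBEEF) + b"\x00" * 160
header = parse_rtp_header(packet)
print(header)
```

The sequence number and timestamp are the fields receivers use to measure loss and jitter, which is what RTCP reports carry back to the sender.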
Windows Phone 8 introduced C++ components and changes in threading, VoIP and audio background modes. To support it I needed to find another thread pool component and tweak OpenSSL a bit to make it compile on Windows Phone 8. I lost the source code so I can’t upload it to GitHub 😢. Also, many links broke because Nokia is not around any more
Porting to New CPU Architecture: pjlib is the foundation of pjsip. Learn how to port it to another platform
First, learn how to compile and use OpenSSL, how to call it from pjsip, and how to make it compile in Visual Studio for Windows Phone 8. I also learned the importance of Winsock and how to port a library. I struggled a lot with porting OpenSSL to Windows RT, then to Windows Phone 8
A lot of links were broken 😢 so I can’t paste them all here.
Since pjsip, rtpproxy and Kamailio are all C and C++ code, I needed a good understanding of both languages, especially pointers and memory handling. I also needed to learn about compile flags for debug and release builds, how to use Make, and how to build static and dynamic libraries.
comp.lang.c Frequently Asked Questions: there are lots of things about C that we don’t know yet
Bit Twiddling Hacks: how to apply clever hacks with bit operators. Really really good reading here
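Two classic tricks of the kind that page collects, sketched in Python for non-negative integers. Both rely on the fact that `v & (v - 1)` clears the lowest set bit. The function names are mine.

```python
def is_power_of_two(v):
    """True if exactly one bit is set."""
    return v > 0 and (v & (v - 1)) == 0


def count_set_bits(v):
    """Count set bits by clearing the lowest one until none remain."""
    count = 0
    while v:
        v &= v - 1   # drops the lowest set bit
        count += 1
    return count


print(is_power_of_two(64), is_power_of_two(96), count_set_bits(0b101101))
```

The loop in `count_set_bits` runs once per set bit rather than once per bit position, which is why it is handy for sparse flags.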