Asterisk (Evolution) and Cisco 7960 phones

Asterisk plays very nicely with Cisco 7960 phones, and we have quite a few of them. Unfortunately there are some interesting configuration issues, and I thought I’d get them all down, with explanations and workarounds.

We were having a problem where our phones, running the P0S3-08-2-00, P0S3-07-4-00, or P0S3-07-5-00 firmware, were unable to make a reliable connection. Or, rather, the connection was made but neither side could hear the other. This was not consistent, with calls working perfectly about 25–50% of the time. Obviously this is not good for a phone system, so I spent many hours trying to debug the issue. Intuitive Voice was very helpful, including reinstalling the entire operating system so we could test it out.

For a while I thought it might be because my PBX firewall didn’t allow me to pass through more than 5000 ports (meaning the ports the system wanted, which ranged from 10000 to 55000, couldn’t all be used). While a good idea, that turned out to be the answer to a completely different problem.

Then I thought the problem might be something with our T‑1 provider, but that didn’t hold up because we had the same problems with calls to and from other extensions on the system.

So I tried a softphone. Worked perfectly. That meant the problem was somewhere on the Cisco phones, which are generally big black boxes — there’s no good way to see inside. Asterisk itself gave no hints — log files were identical for calls that worked and ones that didn’t.

So, at the suggestion of Intuitive Voice, I started trying to reload old firmware, starting with P0S30201. That one didn’t work on the phone I was using. Neither did P0S30203, but P0S3-06-3-00 did take. I did some testing with P0S3-06-3-00 just for the heck of it, and wouldn’t you know, it worked perfectly more than 95% of the time! Every so often the global mute would happen, but by and large it was extremely stable!

So the solution became easy — I pushed out the downgrade to all my phones and poof! No more problem!

To push a downgrade (or, for that matter, an upgrade), do the following:

  1. Update SIPDefault.cnf to reflect the correct flash version (“image_version: P0S3-06-3-00”).
  2. Update any SIP<mac>.cnf files that set image_version so they reflect the same version.
  3. Modify OS79XX.TXT to show the correct version, “P0S3-06-3-00”.

Note that the name starts with P-zero-S, not POS as in point-of-sale.
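The image_version edit in the steps above can be scripted if you have a lot of SIP<mac>.cnf files. This is only a minimal sketch in Python: the function name is my own, and it assumes each file holds one “key: value” setting per line, with the value optionally in double quotes.

```python
import re

# Matches a line such as:  image_version: P0S3-08-2-00
# or, with quotes:         image_version: "P0S3-08-2-00"
# (the one-setting-per-line layout is an assumption about the .cnf files)
_IMAGE_VERSION = re.compile(
    r'^(?P<key>[ \t]*image_version[ \t]*:[ \t]*)(?P<q>"?)[^"\s]*(?P=q)',
    re.MULTILINE,
)

def set_image_version(cnf_text: str, version: str) -> str:
    """Return cnf_text with every image_version setting pointing at `version`.

    Quoting is preserved, and files that never mention image_version
    come back unchanged.
    """
    def swap(match: re.Match) -> str:
        quote = match.group("q")
        return f'{match.group("key")}{quote}{version}{quote}'

    return _IMAGE_VERSION.sub(swap, cnf_text)
```

To use it, read each .cnf file from your TFTP directory, pass the text through set_image_version(text, "P0S3-06-3-00"), and write the result back. OS79XX.TXT still needs to be changed separately.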

Some of my employees had an issue, however: their update never worked, and rebooting their phones took a really long time. The phones use TFTP to grab their configuration files, and we had our own TFTP server going, so it was relatively easy to debug. Looking at the log files (“tail -f /var/log/messages”) revealed that the problem phones kept asking for CTLSEP<mac>.tlv over and over again, then they’d ask for SIPDefault.cnf a few times, then the normal SIP<mac>.cnf, then dialplan.xml and RINGLIST.DAT twice. This is not normal behavior for these phones. Typically they’ll only ask for each file once, as long as it exists, or two or three times at most if the file doesn’t exist.
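If watching the log scroll by gets tedious, the repeated requests can be counted instead. The sketch below is Python and rests on an assumption: that the TFTP server logs read requests as “RRQ from <address> filename <name>”, which is how tftp-hpa does it. Other servers log differently, so adjust the pattern to match yours. The function names and the threshold of 3 (taken from the “two or three times at most” rule of thumb above) are my own.

```python
import re
from collections import Counter

# Assumed log format (tftp-hpa style); change this for other TFTP servers.
RRQ_LINE = re.compile(r"RRQ from (?P<client>\S+) filename (?P<name>\S+)")

def count_requests(lines):
    """Count how many times each (client, filename) pair was requested."""
    counts = Counter()
    for line in lines:
        match = RRQ_LINE.search(line)
        if match:
            counts[(match.group("client"), match.group("name"))] += 1
    return counts

def repeated_requests(counts, threshold=3):
    """Return only the pairs requested more than `threshold` times."""
    return {pair: n for pair, n in counts.items() if n > threshold}
```

Feeding it the lines of /var/log/messages and printing repeated_requests(count_requests(lines)) should list the phones that are stuck asking for the same file.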

Various websites said we should create an empty CTLSEP<mac>.tlv, or remove it in the middle of the request. That did not work. We tried resetting the phones to factory defaults (basically hold down # as it boots, then press 123456789*0# then “1” when it asks, though in retrospect “2” would have been a better choice), but ended up bricking the phones (to unbrick, I need to set up a DHCP server that also provides the TFTP address — not so fun).

It turned out that some firewalls don’t correctly handle the ephemeral port TFTP needs. Many file transfer protocols start on a well-known port, then use a much higher port to actually do the transfer. Most firewalls correctly handle FTP in this manner by noticing that an inside user has opened an FTP connection (port 21 is the control port for that) and then allowing the FTP site to open a higher port to the inside computer. TFTP is much less commonly used, and I guess some firewalls don’t have the same kind of logic built in. So the phone sends its TFTP request (to UDP port 69) and asks for the file. The TFTP server says, “got it” or “don’t got it.” In the latter case, the phone tries a couple more times for whatever reason and then moves on to the next file. In the former case, the server sends the file from a new high-numbered port, and the firewall drops those packets because it doesn’t associate them with the phone’s request. This confuses the phone, which never hears a useful response and so doesn’t move on to the next file correctly.
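To make the port business more concrete, here is a rough Python sketch of the TFTP packets involved, following the layout in RFC 1350. The function names are mine. The “naive firewall” check is a deliberately simplified model of the behavior I suspect, not a description of how any particular firewall works.

```python
import struct

TFTP_PORT = 69                       # well-known port the request is sent to
OP_RRQ, OP_DATA, OP_ERROR = 1, 3, 5  # opcodes from RFC 1350

def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a read request: opcode, filename, NUL, transfer mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")

def parse_reply(packet: bytes):
    """Decode a DATA or ERROR packet into a (kind, number, payload) tuple."""
    opcode, number = struct.unpack("!HH", packet[:4])
    if opcode == OP_DATA:
        return ("data", number, packet[4:])        # number is the block number
    if opcode == OP_ERROR:
        message = packet[4:].rstrip(b"\0").decode("ascii", "replace")
        return ("error", number, message)          # number is the error code
    raise ValueError(f"unexpected TFTP opcode {opcode}")

def naive_firewall_allows(request_dst_port: int, reply_src_port: int) -> bool:
    """Toy model: only let a reply in if it comes from the port we sent to."""
    return reply_src_port == request_dst_port
```

The phone sends build_rrq("SIPDefault.cnf") to port 69, but the server answers from a freshly chosen port (49152, say), so naive_firewall_allows(TFTP_PORT, 49152) is False and the data never reaches the phone.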

The solution was to either give the phones a static IP and put them in the DMZ in the firewall setup, or to just move the phones outside the firewall. They only need to be outside the firewall for long enough to get their firmware and conf files updated; after that, they can be moved safely back inside the firewall.

Incidentally, if you don’t want to go unplugging the phones all the time (which is annoying, actually, because the cord is in a very inconvenient place to be extracting and reinserting) you can telnet to the phones if they’re set up to allow that. In my .cnf files I have

  • telnet_level: “2”
  • phone_password: “mypassword”

Level 2 is privileged, and the password is obvious. That way I can telnet into the phones and type “reset”, which will reboot the phone.