Category Archives: Network Communications

Redundant Fortigate VPN with Cradlepoint

This post is working from the following assumptions:

  • The Cradlepoint devices are using Cradlepoint’s cloud management service. Otherwise, the routing between the Cradlepoint and the protected network needs to be set up manually; I was unable to get that to work, though I didn’t put much time into it since I didn’t need to.
  • A pseudo hub-and-spoke setup, where redundancy from the hub to the spoke is not a major concern. In this case our hub site is a cloud provider that runs Fortinet’s hosted version of a Fortigate.
  • I am not an expert in this field; some of these steps may not be needed, or the configuration may be suboptimal in places. I am just relating what worked for me.

The initial setup of the spoke site is a simple site-to-site VPN utilizing static IP addresses at each site:
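A rough sketch of what that original tunnel looks like from the FortiOS CLI; the tunnel name, interface, addresses, and pre-shared key below are placeholders, and the actual setup was done through the GUI’s IPsec wizard:

config vpn ipsec phase1-interface
    edit "to-hub"
        set interface "wan1"
        set remote-gw 203.0.113.1
        set psksecret <pre-shared-key>
    next
end
config vpn ipsec phase2-interface
    edit "to-hub-p2"
        set phase1name "to-hub"
    next
end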

That link will need to be redone, since the new connection at the spoke site will need to be an aggregate VPN and the existing IPsec tunnel cannot be set as an aggregate member. I recommend saving the current configuration to USB first, so that if the process goes south the Fortigate can be rebooted and the old, working configuration reloaded.

First, the Cradlepoint needs to be connected to a dedicated interface on the Fortigate. For most of the sites I used WAN2, though at one I had to take a port out of the LAN configuration. Optimally this port will be part of a separate network so that a system can be hooked to other ports in the set for diagnosing subnet-specific issues, but I forget to do that every time. With that in place, I added two equally weighted static routes to the static IP address of the hub: one through the existing gateway and one through the internal address of the Cradlepoint. (I did this because the only thing I wanted routed over the Cradlepoint was work traffic.)
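As a sketch, those two routes look something like this in the CLI, assuming WAN1 carries the existing gateway at 198.51.100.1, WAN2 faces the Cradlepoint at 192.168.50.1, and the hub’s static IP is 203.0.113.1 (all placeholder addresses); equal distances keep the routes equally weighted:

config router static
    edit 0
        set dst 203.0.113.1 255.255.255.255
        set device "wan1"
        set gateway 198.51.100.1
        set distance 10
    next
    edit 0
        set dst 203.0.113.1 255.255.255.255
        set device "wan2"
        set gateway 192.168.50.1
        set distance 10
    next
end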

Next, make a new VPN connection at the Hub that will listen for the Spoke connection. This will be a “dial-up” VPN connection; I used IKEv2 and forced NAT traversal based on recommendations from Fortinet’s support forum:
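A hedged sketch of what that dial-up listener might look like in the CLI (tunnel and interface names are placeholders; I believe the CLI keyword for the GUI’s “Forced” NAT traversal setting is “forcible”, but check your firmware):

config vpn ipsec phase1-interface
    edit "spoke-dialup"
        set type dynamic
        set interface "port1"
        set ike-version 2
        set peertype any
        set nattraversal forcible
        set psksecret <pre-shared-key>
    next
end
config vpn ipsec phase2-interface
    edit "spoke-dialup-p2"
        set phase1name "spoke-dialup"
    next
end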

Then at the spoke site I set up a matching VPN connection, being careful to mark it as an aggregate member:
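Something along these lines on the spoke, with the Cradlepoint on WAN2 and the hub at 203.0.113.1 (again, all placeholders):

config vpn ipsec phase1-interface
    edit "to-hub-cp"
        set interface "wan2"
        set ike-version 2
        set remote-gw 203.0.113.1
        set nattraversal forcible
        set aggregate-member enable
        set psksecret <pre-shared-key>
    next
end
config vpn ipsec phase2-interface
    edit "to-hub-cp-p2"
        set phase1name "to-hub-cp"
    next
end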

Next, set up a new redundant IPsec aggregate on the Spoke and add the new VPN connection to it. On the Hub site, add a new, equally weighted static route to the Spoke’s network using the new VPN connection made at the hub, and add policy rules allowing traffic over it.
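Roughly, on the spoke (using the placeholder tunnel name from above):

config system ipsec-aggregate
    edit "hub-agg"
        set member "to-hub-cp"
        set algorithm redundant
    next
end

And on the hub, a route to the spoke’s (placeholder) network over the dial-up tunnel, with the same distance as the existing route:

config router static
    edit 0
        set dst 192.168.10.0 255.255.255.0
        set device "spoke-dialup"
        set distance 10
    next
end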

So far this has been non-destructive, but the next step will interrupt the connection for a bit, depending on how fast you are. On the spoke, change the route to the hub network so it points at the IPsec aggregate instead of the existing VPN connection, and then change the policy rules allowing traffic to (and from) the hub to use the IPsec aggregate as well. At this point the VPN connection over the Cradlepoint network should come up; if it doesn’t, diagnose and fix the issue (every time I had a problem here it was because the settings in the two VPN connections did not match).
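A sketch of the route side of that change, re-pointing the existing route to the hub network at the aggregate interface (names as above):

config router static
    edit <ID of the existing route to the hub network>
        set device "hub-agg"
    next
end

The firewall policies get the same treatment in the GUI, swapping the old tunnel interface for the aggregate interface.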

Next, delete the old VPN connection on the Spoke system that went to the Hub network and recreate it, this time marking it as an aggregate member.

In this next step I usually add the new, redone VPN connection to the aggregate and then remove the Cradlepoint tunnel from the aggregate to more thoroughly test the connection. Once I confirm that it works, I add the Cradlepoint back into the aggregate. I then test the redundancy by “breaking” the regular VPN connection at the hub (by changing the pre-shared key, for example); the Fortigate should fail over to the Cradlepoint VPN. When I “un-break” it, it should fail back (confirm by resetting the statistics on the VPN monitor).
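While testing, the standard FortiOS diagnostics are handy for confirming which tunnels are up and passing traffic (these are general VPN commands, not anything aggregate-specific):

diagnose vpn ike gateway list
diagnose vpn tunnel list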

Remote Voicemail on the BCM

For quite a while we have relied on dialing ‘**’ when dialing into the BCM to check our voicemail. However, our snazzy new touchscreen cellphones don’t handle ‘clicking’ the asterisk in a quick, consecutive fashion. I looked around on the Internet for an alternative, but then realized that I could just dial the number that the voicemail system uses as its ‘forward’ destination (ours is 300, but yours may be different).

Sykes and AT&T

Our AT&T sales rep got us a pretty good deal on a new MPLS implementation, but the install left a lot to be desired.  Our previous MPLS install was handled by DPSciences and they did a great job of making sure all the technical aspects were handled and of making sure all the billing issues that accompany such an install were handled as well.  However, for the new implementation AT&T used Sykes and they did a wretched job of managing the install:

  • We weren’t given proper ETAs for installs.  This resulted in the install at one site being done improperly and the install at a different site not being scheduled at all.  This then led to a sticky billing issue where we were billed for circuits that were ready to be turned up but couldn’t be utilized until the last circuit was brought up.
  • I asked the Sykes tech if our Class of Service was brought over.  The lack of a response should have told me that something was up; it turned out it wasn’t applied at all by their incompetent tech.  The silence didn’t register as anything unusual since their tech never responded to me about anything.
  • They didn’t bring over an important custom configuration for one of our routers, despite my nagging them to do so.  How hard is it to save a router configuration and then apply it to a different router of the same make and model?!?
  • It took a week of incessant nagging on my part while trying to figure out what was wrong with our new MPLS implementation before they eventually got back to me to let me know that the routing errors were happening only because one of our old MPLS routers had been left on.  Yes, I guess I should have known, but would it have killed them to let me know a couple of days sooner?

As a side note, we had an AT&T MIS Internet circuit put in by ACS and they did a great job as well.  There are plenty of other great AT&T outsource choices besides the apparently awful Sykes.

(UPDATE 3/3/11: Since so many users from AT&T have apparently taken an interest in this post, I figured I would elaborate on one aspect of the situation.  On our original MPLS implementation, which had chronic maintenance issues, we had a class of service configuration to accommodate primarily voice traffic and secondarily Outlook and RDP/ICA traffic.  Despite e-mails and phone calls, Sykes decided not to bring this over and closed out the install, meaning that I had to go through (several) change procedures before it was operating properly.)

SSL on the Nortel BCM

All of these concepts are probably familiar to those in the know, but I wasn’t able to put the pieces together until I upgraded to Windows 7 and found that, without a properly working SSL configuration, Windows 7 wasn’t going to load the BCM system administration utility.

The documentation for the Nortel BCM says to go to the ‘Maintenance’ section, then ‘Maintenance Tools’ (well, it doesn’t quite say that, but I found it anyway), and then ‘Upload a Certificate and Private Key’.  But where do I get those?  I knew the certificate would come from the Windows-based CA that runs in our domain, but there wasn’t a tool to generate a certificate request on the BCM.  My clue was that a private key, the key used to generate the request, had to be uploaded as well.  I then used the version of openssl on the BCM to do the work, though in hindsight it probably would have been easier to use a newer version installed elsewhere.

First, upon doing a version check of openssl I noticed that the working directory it was looking for (‘c:\openssl\ssl’) didn’t exist.  I manually created the directory and did the work from there.  Eventually I discovered that the ‘openssl.cnf’ file that shipped with the BCM (which was called something else and buried on a different drive) was lacking, so I ended up brewing my own with the following settings:

[ req ]
default_bits           = 2048
default_keyfile        = privkey.pem
encrypt_rsa_key        = no
default_md             = sha1
distinguished_name     = req_distinguished_name
x509_extensions        = root_ca_extensions

[ req_distinguished_name ]
0.organizationName     = Organization Name (company)
organizationalUnitName = Organizational Unit Name (department, division)
emailAddress           = Email Address
emailAddress_max       = 40
localityName           = Locality Name (city, district)
stateOrProvinceName    = State or Province Name (full name)
countryName            = Country Name (2 letter code)
countryName_min        = 2
countryName_max        = 2
commonName             = Common Name (hostname, IP, or your name)
commonName_max         = 64

[ root_ca_extensions ]
basicConstraints       = CA:true

I then executed a command along the lines of the following, filling out the ‘form’ that comes up:

openssl req -new -newkey rsa:1024 -nodes -keyout bcmkey.pem -out bcmreq.pem

I then FTP’d the two files up to my file server (bad form, but I already said that a different method would have been better).  I put the ‘req’ file through my Windows CA (‘Base 64 encoded’, and unlike the HP iLO card I didn’t need the whole chain) to get the web server certificate, then uploaded them both to the BCM and voilà, the SSL warning messages were gone and the manager was happy under Windows 7.
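If you want to double-check that the certificate that comes back from the CA actually matches the private key before uploading them, a quick openssl sanity check is to compare the two modulus hashes (the certificate filename here is just an example):

openssl x509 -noout -modulus -in bcmcert.cer | openssl md5
openssl rsa -noout -modulus -in bcmkey.pem | openssl md5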

Bandwidth Issues Are Odd

A couple of days after I had updated a series of products within the McAfee EPO, I started getting complaints from users about slow access times over the WAN.  After running a technically intensive test (ping) I determined that their complaints were well founded.  In an earlier time I would have hopped on the router and done who knows what to find the offending party, but I’ve been spoiled these last couple of years by having outsourced routers (inaccessible to me) with our MPLS setup.  Not knowing what was causing the issue, I tried toggling some Internet services, investigated file shares, e-mail usage, etc., before taking a ‘what the heck’ approach and stopping the EPO Server service.  The instant I stopped it, the bandwidth issue cleared up.  When I started it back up, the problem came back.

Thinking that the issue lay with the EPO program itself, I figured the best approach would be to try to upgrade myself out of this issue by moving from EPO 4.0 to EPO 4.5.  This was an event all its own and required a bit of work to get past a database upgrade issue.  After I was done the system came back up and…same issue: the WAN pipe gets completely clogged (apart from our class of service reservations, of course).  I tried following some bandwidth minimization strategies put forward by McAfee, but they weren’t really a good fit for the issue we were having.  I wasn’t getting anywhere with the logs in trying to determine what the huge chunk of data being sent to the server was, so I fired up Network Monitor on the off chance that some XML file was being sent in clear text and would let me determine what the data was.

When I got into the captured data I began scanning some packets, and while none of them were plain text, I did notice a huge disparity in which machines were communicating with the server.  It was so large that it appeared two PCs, one at each of our remote locations, were the sole users of the server over that brief time.  These PCs were also communicating over port 8085, which is the agent communication port for the EPO server.  I opened Services on the troubled units and stopped the McAfee Framework service, and the bandwidth issue cleared up immediately.  When I started the service back up, the bandwidth issue would spring back up, although it took a variable amount of time.

I’m going to try redoing the agents on the affected systems to see if I can clear this issue up…

UPDATE: Forcing a reinstall of the agent through EPO cleared the issue up on the affected systems.

UPDATE 2: Not so fast!  It appears that, for whatever reason, my two problem PCs were not applying the second patch for McAfee VirusScan 8.7.  If I had to guess, they were constantly trying to download the patch, leading to my bandwidth issue.  The problem now seems permanently cleared up after manually applying the patch to the systems.  The misdiagnosis in the earlier update was caused by a very long lag between when the agent was installed and when it checked in with the EPO.