Questions I’m Asking Myself About SD-WAN Solutions

There’s a lot to SD-WAN technology, and there are a number of devils in the implementation details that have been nagging at me as I listen to briefings and record podcasts from vendors in the space.

SD-WAN technology questions that bother me…

  1. What’s the impact on hosts of virtual machine-based endpoints? That is, how much CPU does an SD-WAN VM eat in solutions that use VMs? Not a simple question to answer anymore, as there’s usually cryptography involved.
  2. How much latency does the SD-WAN controller introduce, and under what circumstances?
  3. When WAN-based SD-WAN tunnel endpoints are inevitably separated from the controller due to a network fault, what happens?
  4. How does the SD-WAN infrastructure track tunnel availability, and how quickly does the controller react when a tunnel is down?
  5. What happens to in-flight traffic when a tunnel dies? (Every financial services organization is going to ask that question.)
  6. How does the SD-WAN solution get traffic into the system? As in, routers attract traffic by being default gateways or being in the best path for a remote destination. SD-WAN tunnel endpoints need to attract traffic somehow, just like a WAN optimizer would. How is it done? WCCP? PBR? Static routing? (All 3 of those are mostly awful if you think about them for about 2.5 seconds.) Or do the SD-WAN endpoints interact with the underlay routing system with BGP or OSPF and advertise low cost routes across tunnels? Or are they placed inline? Or is some other method used?
  7. What about traffic I don’t want to go through the overlay fabric? How do I exempt it?
  8. Double-encryption is often a bad thing for application performance. Can certain traffic flows be exempted from encryption? As in, encrypted application traffic is tunneled across the overlay fabric, but not encrypted a second time by the tunnel?
  9. Is the encapsulation type standard or proprietary? If it’s proprietary, convince me I don’t care.
  10. Assuming unique keys per tunnel (and I’d hate to imagine a single key per tunnel fabric), how are these keys managed and by whom?
  11. Is path symmetry important when traversing an SD-WAN infrastructure? Why or why not? Depending on how the controller handles flow state and reflects it to various endpoints in the tunnel overlay fabric, this could be an interesting answer.
  12. Selectively forcing certain flows to traverse firewalls or other security devices is part of the SD-WAN unicorn. How, exactly, does this happen, and what are the network underlay dependencies required to bring it about? After all, SD-WAN service chaining differs from service chaining through a hypervisor-based vSwitch, where a controller can direct the traffic wherever it wants inside of a nice, tidy ecosystem. SD-WAN service chaining has to work on traditional IP fabrics that have no inherent notion of service chaining, and all you’ve got to work with are overlay tunnel endpoints.
  13. Just how granularly can I identify applications, considering progressively more applications are encrypted as they traverse the wire?
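
Question 4 is easier to reason about with numbers in hand. Here is a minimal Python sketch, assuming a BFD-like keepalive scheme in which a tunnel is declared down after a fixed number of consecutive missed hellos. The class and field names are hypothetical and do not reflect any vendor's API; real products may track tunnel health quite differently.

```python
from dataclasses import dataclass


@dataclass
class TunnelMonitor:
    """Hypothetical keepalive tracker for a single overlay tunnel."""

    hello_interval_s: float   # assumed: how often the far end sends a keepalive
    miss_multiplier: int      # assumed: missed keepalives tolerated before "down"
    last_seen_s: float = 0.0  # timestamp of the most recent keepalive received

    def detection_time_s(self) -> float:
        # Time without keepalives after which the tunnel is declared down.
        return self.hello_interval_s * self.miss_multiplier

    def on_keepalive(self, now_s: float) -> None:
        # Record the arrival time of a keepalive from the far end.
        self.last_seen_s = now_s

    def is_up(self, now_s: float) -> bool:
        # The tunnel counts as up only while the silence is shorter
        # than the detection time.
        return (now_s - self.last_seen_s) < self.detection_time_s()


monitor = TunnelMonitor(hello_interval_s=1.0, miss_multiplier=3)
monitor.on_keepalive(now_s=10.0)
print(monitor.is_up(now_s=12.5))  # still inside the detection window
print(monitor.is_up(now_s=13.0))  # three full intervals of silence
```

Under these assumptions, detection alone takes the hello interval times the multiplier. Whatever the controller then needs to reprogram paths comes on top of that, which is the number worth asking each vendor about.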

There are probably more questions I haven’t thought of yet, and I know from briefings that these questions have answers. But I’ve also noticed that they are not always the same answers; the solutions have some interesting variability. For all the problems SD-WAN assists with, no technology is without its own complexity of some sort. Everything is a trade-off. I see huge value in SD-WAN, but also see a bit of implementation pain. Call me paranoid.

If I were evaluating SD-WAN…

If I were evaluating SD-WAN, I’d be asking these questions and more in the exploration phase. Then if I moved into a trial phase, I’d make a long list of specific business goals to meet and application behaviors to expect when implementing the solution. And then I’d get medieval, breaking it any way that I could think of to see how the system recovers, up to and including blasting the SD-WAN endpoints with both too much volume and too many unique flows. What happens when you try to kill the tunnel endpoints? All useful stuff to find out before you commit to a vendor providing you with technology you’ll likely come to rely on heavily once it’s in place and working.
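
For the "too many unique flows" part of a trial, it helps to build the flow list deterministically so a test run can be repeated. Below is a minimal Python sketch that only enumerates distinct (source IP, destination IP, destination port) tuples; it sends no traffic. The function name and the example prefixes are assumptions for illustration, and feeding the tuples to an actual traffic generator is left to whatever tool you already use.

```python
import ipaddress
from itertools import islice, product


def unique_flows(src_net, dst_net, dst_ports, count):
    """Return up to `count` distinct (src_ip, dst_ip, dst_port) tuples.

    Hypothetical helper: it enumerates flow identifiers for a stress
    test plan and does not generate any packets itself.
    """
    src_hosts = ipaddress.ip_network(src_net).hosts()
    dst_hosts = ipaddress.ip_network(dst_net).hosts()
    # product() yields every combination exactly once, so each tuple is unique.
    combos = product(src_hosts, dst_hosts, dst_ports)
    return [(str(s), str(d), p) for s, d, p in islice(combos, count)]


# Example prefixes are placeholders for the lab networks behind each endpoint.
flows = unique_flows("10.0.0.0/30", "10.0.1.0/30", [80, 443], count=100)
print(len(flows))  # limited by the number of possible combinations
print(flows[0])
```

Two usable hosts per /30 and two ports give only eight combinations here; widening the prefixes or the port list scales the flow count up to whatever the endpoint's flow table is supposed to handle, and then past it.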

I would love to hear your SD-WAN “technology in production” stories in the comments below.

SD-WAN vendors include…

Vendors I am aware of in the SD-WAN space (each with some degree of product offering) include Viptela, CloudGenix, Talari, VeloCloud, Silver Peak, Riverbed, and Cisco.