
Virtual Switching Sanity Check - NFS, BGP & Kubernetes



I have a home Kubernetes cluster that runs in four VMs on top of Proxmox. Proxmox is tagged to VLAN 20, and the Kubernetes VMs are tagged to VLAN 40.



The Kubernetes VMs are BGP neighbors of my router, so I can tag pods to run on one of two other VLANs designated as DMZ spaces, 50 and 60. In short, the network looks like this:



- VLAN 1: Networking Hardware
- VLAN 20: Physical Machines
- VLAN 40: Kubernetes VMs
- VLAN 50: Internal Kubernetes Deployments
- VLAN 60: External Kubernetes Deployments
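For reference, the router side of this peering looks roughly like the following on the EdgeRouter (the ASNs and node addresses here are placeholders rather than my exact configuration):

    # EdgeOS sketch of the BGP peering with the four Kubernetes nodes.
    # ASNs (64512/64513) and the 10.0.40.x node addresses are placeholders.
    configure
    set protocols bgp 64512 parameters router-id 10.0.40.1
    set protocols bgp 64512 neighbor 10.0.40.11 remote-as 64513
    set protocols bgp 64512 neighbor 10.0.40.12 remote-as 64513
    set protocols bgp 64512 neighbor 10.0.40.13 remote-as 64513
    set protocols bgp 64512 neighbor 10.0.40.14 remote-as 64513
    commit
    save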


This works great; everything can communicate with one another and with the internet just fine. The one exception is performance.



My Proxmox server also acts as my storage server by exporting a ZFS pool over NFS. This works great and is capable of some pretty fast reads and writes for a home storage server: upwards of 6 Gb/s reads, for example.
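For anyone curious, the export amounts to something like the following on the Proxmox host (the dataset path and client subnet are placeholders, not my exact layout):

    # /etc/exports on the Proxmox host; /tank/k8s and 10.0.40.0/24 are placeholders.
    /tank/k8s  10.0.40.0/24(rw,sync,no_subtree_check,no_root_squash)

    # Reload the export table and confirm what is being served:
    exportfs -ra
    exportfs -v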



When I used to run Docker containers directly on my Proxmox server, virtual switching allowed the containers to reach the NFS server hosted by Proxmox, by hostname, at nearly that speed.



Furthermore, before I set up VLANs, the Kubernetes VMs ran on the same VLAN (1) as Proxmox itself, and any pods deployed on Kubernetes could also reach the NFS server by hostname at nearly that speed.



However, now that I have configured VLANs and use BGP to provision my Kubernetes pods on VLANs separate from the hosts, network throughput is capped at 1 Gb/s, if not worse.
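To pin down where the cap comes from, comparing throughput within a single VLAN against throughput across VLANs (which has to pass through the router) should make it obvious; iperf3 works well for this (the addresses below are placeholders):

    # On the Proxmox host (VLAN 20):
    iperf3 -s

    # From a Kubernetes VM on VLAN 40 -- this traffic must cross the router:
    iperf3 -c 10.0.20.10 -t 30

    # From a pod on VLAN 50/60, using any iperf3 container image, e.g.:
    kubectl run iperf3-test --rm -it --image=networkstatic/iperf3 -- -c 10.0.20.10 -t 30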



My Ubiquiti EdgeRouter Lite and UniFi Switch 8 are both 1 Gb devices, so that makes sense. However, it is starting to feel very painful in my lab. For example, cover art in Plex Media Server takes upwards of 10 seconds to load when I scroll through my library, because the Kubernetes volume mounts the database from the NFS server. Similarly, Deluge is behaving incredibly poorly: the web interface crashes frequently, and any action such as opening the Preferences panel or viewing the Details section of a new torrent can take several minutes. Deluge's cache is set to use 4 GB of memory, but I'm unsure whether these performance issues are caused by my network or by Deluge simply not scaling well to 1,100 torrents. Lastly, my Kubernetes deployments that interact heavily with a database (Plex, Jira, etc.) sometimes end up with a corrupted database after a few weeks of running. This is presumably because of network latency, but I'm not sure.
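For reference, the NFS-backed volumes boil down to Kubernetes NFS PersistentVolumes along these lines (server address, path, size, and mount options are placeholders, not my exact manifests). I mention the mount options because SQLite databases like Plex's are known to be sensitive to NFS file locking, which could explain the corruption as much as raw latency does:

    # Rough shape of one NFS-backed volume; server, path, size, and
    # mount options here are placeholders.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: plex-config
    spec:
      capacity:
        storage: 50Gi
      accessModes:
        - ReadWriteMany
      nfs:
        server: 10.0.20.10
        path: /tank/k8s/plex
      mountOptions:
        - nfsvers=4.1
        - hard
        - noatime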



I'm hoping to get a few questions answered with this post:




  1. I know my network is complex, especially for a homelab. However, my homelab is used almost entirely for learning for my job, and the hobby is fun for me, especially when I cater to obscene levels of complexity. Given that I'm okay with the complexity, does everything look correctly configured to you?


  2. Would purchasing a 10Gb switch resolve this issue, or would it also be necessary to purchase a 10Gb router, since the EdgeRouter is a BGP neighbor of the Kubernetes nodes?


  3. If both a switch and a router would be necessary, would it instead be possible to purchase a 10Gb switch with BGP capabilities?


  4. What hardware would you recommend I purchase to resolve this issue? Ideally I would like to keep the total cost under $500-1,000, but that doesn't look possible given the incredibly high cost of 10Gb routers.


  5. Would it be possible to use a different Kubernetes StorageClass to store the data directly on the nodes? What would this look like? (I've sketched what I'm picturing after this list.)


  6. Would you recommend a different solution to my problem?
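Regarding question 5, what I'm picturing is something along the lines of Kubernetes' built-in local volume support, roughly like this (purely illustrative; names, paths, and sizes are made up, and this is not something I run today):

    # Illustrative sketch of node-local storage for question 5.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-storage
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: plex-config-local
    spec:
      capacity:
        storage: 50Gi
      accessModes:
        - ReadWriteOnce
      storageClassName: local-storage
      local:
        path: /mnt/k8s-data/plex
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - k8s-node-1

The obvious trade-off is that the data becomes pinned to a single node, so I'm curious whether that is what people actually do in practice.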











Tags: networking, router, kubernetes, proxmox, bgp





