Virtual Switching Sanity Check - NFS, BGP & Kubernetes
I have a home Kubernetes cluster that runs on four VMs on top of Proxmox. Proxmox is tagged to VLAN 20, and the Kubernetes VMs are tagged to VLAN 40.
The Kubernetes VMs are BGP neighbors of my router so that I can assign pods to one of two other VLANs, 50 and 60, which are designated as DMZ spaces. In short, the network looks like this:
- VLAN1: Networking Hardware
- VLAN20: Physical Machines
- VLAN40: Kubernetes VMs
- VLAN50: Internal Kubernetes Deployments
- VLAN60: External Kubernetes Deployments
This works well: everything can communicate with everything else and with the internet just fine. The one exception is performance.
My Proxmox server also acts as my storage server, exporting a ZFS pool over NFS. This works well and is capable of some pretty fast reads and writes for a home storage server: upwards of 6 Gb/s for reads, for example.
When I used to run Docker containers directly on my Proxmox server, virtual switching allowed the containers to interact with the NFS server hosted by Proxmox by hostname at nearly that speed.
Furthermore, before I set up VLANs, the Kubernetes VMs ran on the same VLAN (1) as Proxmox itself, and any pods deployed on Kubernetes could also reach the NFS server hosted by Proxmox by hostname at nearly that speed.
However, now that I have configured VLANs and use BGP to place my Kubernetes pods on separate VLANs from the hosts, network throughput has been capped at 1 Gb/s, if not lower.
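To make the suspected bottleneck concrete, here is a minimal sketch in Python. It assumes that traffic between two endpoints on the same VLAN stays on the Proxmox virtual bridge, while traffic between different VLANs has to leave the host and be routed by my 1 Gb/s router. The hop names, the assumed path, and the use of the 6 Gb/s NFS read figure as the bridge speed are illustrative assumptions, not per-hop measurements.

```python
from typing import List

# Rough model: a path is only as fast as its slowest hop (speeds in Gb/s).
# All hop names are hypothetical labels for this sketch.
HOP_SPEED_GBPS = {
    "proxmox_bridge": 6.0,    # assumed: the ~6 Gb/s NFS read speed seen on-host
    "unifi_switch_8": 1.0,    # 1 Gb/s device
    "edgerouter_lite": 1.0,   # 1 Gb/s device, assumed to be the only inter-VLAN router
}


def path_hops(src_vlan: int, dst_vlan: int) -> List[str]:
    """Return the hops traffic is assumed to cross between two VLANs."""
    if src_vlan == dst_vlan:
        # Same VLAN: switched locally on the hypervisor.
        return ["proxmox_bridge"]
    # Different VLANs: out to the switch, through the router, and back.
    return [
        "proxmox_bridge",
        "unifi_switch_8",
        "edgerouter_lite",
        "unifi_switch_8",
        "proxmox_bridge",
    ]


def path_limit_gbps(src_vlan: int, dst_vlan: int) -> float:
    """Upper bound on throughput: the speed of the slowest hop on the path."""
    return min(HOP_SPEED_GBPS[hop] for hop in path_hops(src_vlan, dst_vlan))


def gbps_to_megabytes_per_second(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


if __name__ == "__main__":
    cases = [
        ("old setup, everything on VLAN 1", 1, 1),
        ("pod on VLAN 50 to NFS on VLAN 20", 50, 20),
    ]
    for label, src, dst in cases:
        limit = path_limit_gbps(src, dst)
        print(f"{label}: <= {limit} Gb/s "
              f"(~{gbps_to_megabytes_per_second(limit):.0f} MB/s)")
```

Under these assumptions, the model only gives an upper bound; it says nothing about latency or about how the router's CPU handles routed traffic, both of which could make real throughput lower.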
My Ubiquiti EdgeRouter Lite and UniFi Switch 8 are both 1Gb devices, so it makes sense. However, this is starting to feel very painful in my lab.
For example, cover art in Plex Media Server takes upwards of 10 seconds to load when I scroll through my library, because Kubernetes mounts the Plex database from a volume on the NFS server.
Similarly, Deluge is performing very poorly. The web interface crashes frequently, and any action such as opening the Preferences panel or viewing the Details section of a new torrent can take several minutes. Deluge's cache is set to use 4GB of memory, but I'm unsure whether these performance issues are caused by my network or by Deluge simply not scaling well to 1100 torrents.
Lastly, my Kubernetes deployments that interact heavily with a database (Plex, Jira, etc.) sometimes end up with a corrupted database after a few weeks of running. This is presumably because of network latency, but I'm not sure.
I'm hoping to get a few questions answered with this post:
- I know my network is complex, especially for a homelab. However, I use my homelab almost entirely for learning for my job, and the hobby is fun for me, especially when I cater to obscene levels of complexity. Given that I am okay with the complexity, does everything seem to be configured correctly?
- Would purchasing a 10Gb switch resolve this issue, or would it also be necessary to purchase a 10Gb router, since the EdgeRouter is a BGP neighbor of the Kubernetes nodes?
- If it would be necessary to purchase both a switch and a router, would it instead be possible to purchase a 10Gb switch with BGP capabilities?
- What hardware would you recommend I purchase to resolve this issue? Ideally I would like to keep the total cost under $500-1,000, but that doesn't look possible given the very high cost of 10Gb routers.
- Would it be possible to use a different Kubernetes Storage Class to store the data directly on the nodes? What would this look like?
- Would you recommend a different solution to my problem?
networking router kubernetes proxmox bgp
asked 22 mins ago by TJ Zimmerman