Data integration

















I am new to the (big) data world. I have a large amount of log data, hundreds of terabytes in size, stored as tables in Parquet format on an object store. I also have a list of IP addresses, with related details about each IP, stored in CSV format on the same object store.

I want to show how the two datasets relate in graphical form.

What is the most efficient way to check whether any IP address from the list appears in the log data?

I am planning to merge the two datasets into a third table: a new large table containing all the data from the log table plus new columns for the IP details, which I would then query for findings. Is that a good approach?

I am open to any suggestions.
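
To make the lookup concrete, here is a minimal sketch in plain Python, not a tested design for data at this scale. It assumes the IP list is small enough to hold in memory, and the column names (`ip`, `owner`, `category`, `src_ip`) and all sample rows are made up for illustration. The idea is to load the small list into a hash map, stream the log rows past it, and attach the details only to matching rows, rather than writing a third table that copies all of the log data.

```python
import csv
import io


def load_ip_details(csv_text):
    """Build an in-memory lookup from IP address to its detail row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ip"]: row for row in reader}


def enrich_matches(log_rows, ip_details, ip_field="src_ip"):
    """Yield only the log rows whose IP is in the list, with details attached."""
    for row in log_rows:
        details = ip_details.get(row[ip_field])
        if details is not None:
            # Merge on the fly instead of materializing a full third table.
            extra = {f"ip_{k}": v for k, v in details.items() if k != "ip"}
            yield {**row, **extra}


# Hypothetical sample data standing in for the CSV list and the Parquet logs.
IP_CSV = "ip,owner,category\n10.0.0.5,lab,internal\n203.0.113.9,unknown,suspicious\n"
LOG_ROWS = [
    {"ts": "t0", "src_ip": "10.0.0.5", "bytes": 120},
    {"ts": "t1", "src_ip": "198.51.100.7", "bytes": 64},
    {"ts": "t2", "src_ip": "203.0.113.9", "bytes": 900},
]

ip_details = load_ip_details(IP_CSV)
matches = list(enrich_matches(LOG_ROWS, ip_details))
for match in matches:
    print(match["ts"], match["src_ip"], match["ip_category"])
```

In Spark terms this resembles joining the large Parquet table against a broadcast copy of the small CSV table, which avoids shuffling the log data. Whether that fits here depends on the real size of the IP list, which the question does not state.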









Tags: database, big-data, apache-spark

asked 5 mins ago by Rohit





















