 2021-05-25, 05:09 #1 MooMoo2     Aug 2010 659 Posts Could you fit the entire public Internet in your home? Suppose you could instantly download all of the publicly accessible data on the Internet onto some hardware. That includes things like most cat videos, Twitter rants, and porn, but doesn't include things like email messages, social media accounts set to "private", online banking info, or anything located behind a paywall and/or requiring a login to view. Assume that duplicate content (i.e., the same video or picture posted multiple times) can somehow be automatically identified and not downloaded more than once. With that said, could you fit all of that hardware inside your home? You are allowed to remove your furniture and other belongings, but you're not allowed to store the hardware outdoors (backyards, patios, etc.). I wouldn't consider my house to be really big, but I think I might be able to pull it off by filling it from floor to ceiling with nothing but portable external hard drives.
 2021-05-25, 05:32 #2 retina Undefined     "The unspeakable one" Jun 2006 My evil lair 188816 Posts How much data is "the publicly accessible data on the Internet"? I'm quite sure the number changes very frequently, both up and down. Without more information it is impossible to give an answer. Last fiddled with by retina on 2021-05-25 at 08:47 Reason: "is it" ---> "it is"
 Originally Posted by retina How much data is "the publicly accessible data on the Internet"? I'm quite sure the number changes very frequently, both up and down. Without more information it is impossible to give an answer.
I agree.

I'd give serious consideration to microSD for their storage density. The original question did not specify access time to any particular part of the data.

 2021-05-25, 11:03 #4 Dr Sardonicus     Feb 2017 Nowhere 13·383 Posts For a real-world reference, see "Internet Archive Wayback Machine." The Internet Archive has over 20 years' worth of web pages stored. I don't know how much room a day's haul takes up. Alas, when the Archive created an "emergency library" during the pandemic in 2020, a group of large publishers filed a lawsuit for "mass copyright infringement," and the Archive closed the library. Efforts to settle the case have so far failed. Trial is set for November 2021. Litigating the case could bankrupt the Internet Archive. Parents, please note: If you're thinking about reading bedtime stories aloud to your kids, think again.
I think this is relevant:

 The Google Search index contains hundreds of billions of webpages and is well over 100,000,000 gigabytes in size.
https://www.kevin-indig.com/blog/goo...t-grow-at-all/
Google caches a copy of the pages it indexes and the largest page it has been able to cache/index was 977 kilobytes.

ETA: So it should not include decimal expansions of record primes or other very large documents, which may or may not make a big difference (probably does).

Last fiddled with by a1call on 2021-05-25 at 11:50

 Originally Posted by Dr Sardonicus For a real-world reference, see "Internet Archive Wayback Machine." The Internet Archive has over 20 years' worth of web pages stored. I don't know how much room a day's haul takes up.
The entire Internet Archive is "only" about 90 petabytes:
https://www.protocol.com/amp/interne...ure-2650997964

I have one 5 TB hard drive on my desk, which can store ~4.54 TB of actual data. So 20,000 of those hard drives would be needed.

That 5 TB hard drive is 3" wide x 4.5" long x 0.8" high (~11 cubic inches). 20,000 hard drives would be ~220,000 cubic inches, or ~127 cubic feet, which can easily fit into a small room.
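
A quick way to double-check that estimate is a few lines of Python. This is only a sketch: the function names are made up for illustration, and it takes the 90 PB archive size, the ~4.54 TB usable capacity, and the 3" x 4.5" x 0.8" drive dimensions from the figures above at face value, with 1 PB treated as 1000 TB.

```python
import math

CUBIC_INCHES_PER_CUBIC_FOOT = 12 ** 3  # 1728


def drives_needed(archive_tb, usable_tb_per_drive):
    """Number of whole drives required to hold archive_tb terabytes."""
    return math.ceil(archive_tb / usable_tb_per_drive)


def stack_volume_cubic_feet(n_drives, width_in, length_in, height_in):
    """Volume of n_drives packed with no gaps, in cubic feet."""
    cubic_inches = n_drives * width_in * length_in * height_in
    return cubic_inches / CUBIC_INCHES_PER_CUBIC_FOOT


# 90 PB expressed in TB (assumption: 1 PB = 1000 TB)
count = drives_needed(90_000, 4.54)
# Use the rounded 20,000-drive figure from the post, with unrounded dimensions
volume = stack_volume_cubic_feet(20_000, 3.0, 4.5, 0.8)

print(f"drives needed: {count:,}")
print(f"volume of 20,000 drives: {volume:.1f} cubic feet")
```

Using the unrounded 10.8 cubic inches per drive instead of ~11 gives a slightly smaller volume, but the conclusion (a small room is plenty) does not change.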

 Originally Posted by retina How much data is "the publicly accessible data on the Internet"? I'm quite sure the number changes very frequently, both up and down. Without more information it is impossible to give an answer.
That's the main point of the thread. Although that number does change very frequently, it might be possible to get a rough estimate to within an order of magnitude.

Estimating a typical house's storage is the easy part. A 2000 square foot house with 8' high ceilings is 16,000 cubic feet. At ~157 8 TB hard drives per cubic foot, that's ~2.5 million hard drives in a house, or about 20 exabytes.
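
The house estimate can be reproduced the same way. A minimal sketch, assuming the numbers in the paragraph above (2000 square feet, 8' ceilings, ~157 drives per cubic foot, 8 TB per drive) and decimal units (1 EB = 1,000,000 TB); the function name is hypothetical.

```python
def house_capacity(floor_area_sqft, ceiling_ft, drives_per_cubic_foot, tb_per_drive):
    """Return (volume in cubic feet, number of drives, capacity in exabytes)."""
    volume_cuft = floor_area_sqft * ceiling_ft
    drives = volume_cuft * drives_per_cubic_foot
    exabytes = drives * tb_per_drive / 1_000_000  # TB -> EB (decimal)
    return volume_cuft, drives, exabytes


volume_cuft, drives, exabytes = house_capacity(2000, 8, 157, 8)
print(f"{volume_cuft:,} cubic feet, {drives:,} drives, {exabytes:.1f} EB")
```

Note that this assumes perfect packing with no room for shelving, cabling, or walking space, so it is an upper bound rather than a practical figure.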

 Originally Posted by xilman I'd give serious consideration to microSD for their storage density. The original question did not specify access time to any particular part of the data.
That is correct; you're allowed to use microSD instead of hard drives.

Last fiddled with by MooMoo2 on 2021-05-25 at 15:53 Reason: Typo

 2021-05-25, 15:52 #8 kriesel     "TF79LL86GIMPS96gpu17" Mar 2017 US midwest 22×5×172 Posts

Approximating the internal storage volume of a 3BR home as 2.3 m high x 7 m wide x 13 m long x 2 floors = ~419 cubic meters (without including an attached garage), and the amount of data to be stored as 20 zettabytes (http://www.live-counter.com/how-big-is-the-internet/), we need a data storage density of 20 ZB / (419 m^3) = 20,000,000,000 TB / (419 m^3) ~ 48,000,000 TB/m^3.

A 2 TB M.2 external SSD is ~10.7 x 3.2 x 1.05 cm^3 (without cabling, racking etc), or ~3.6e-5 m^3, so 2 TB / 3.6e-5 m^3 ~ 55,600 TB/m^3. That's a problem: roughly 860 times more volume per TB than available. MicroSD at 1 TB, 16 x 12 x 1 mm^3, is ~5.2E6 TB/m^3. Still too bulky, by about a factor of 9.2. That's for cold, disconnected offline storage. If you want it online, with cabling, cooling, and the systems it's interfaced to, it would require considerably higher storage density devices.

Cost is another matter. At ~$20/TB, it's $400,000,000,000 for the drives, assuming that the volume of purchase does not raise prices. Compare to the assets of these individuals. Allowing for growth, you're going to need ~70 TB/second, and increasing at Moore's Law rates, 560,000 times gigabit fiber speed.

So, I conclude an offline snapshot could fit in a neighborhood or large warehouse, but budget is a problem, as is updating.

Last fiddled with by kriesel on 2021-05-25 at 15:52
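
Since the cm^3 and mm^3 to m^3 conversions are an easy place to slip a power of ten, here is a minimal sketch that recomputes the densities, cost, and bandwidth ratio straight from the quoted dimensions. All inputs are the figures from the post; the function and variable names are illustrative only, and decimal units are assumed throughout (1 ZB = 1e9 TB).

```python
HOUSE_VOLUME_M3 = 2.3 * 7 * 13 * 2   # 2.3 m x 7 m x 13 m, two floors
INTERNET_TB = 20e9                   # 20 ZB expressed in TB


def density_tb_per_m3(capacity_tb, dims, metres_per_unit):
    """Storage density of a device whose dims are given in some length unit."""
    x, y, z = dims
    volume_m3 = x * y * z * metres_per_unit ** 3
    return capacity_tb / volume_m3


required = INTERNET_TB / HOUSE_VOLUME_M3
ssd = density_tb_per_m3(2, (10.7, 3.2, 1.05), 1e-2)   # dimensions in cm
microsd = density_tb_per_m3(1, (16, 12, 1), 1e-3)     # dimensions in mm

print(f"required: {required:,.0f} TB/m^3")
for name, density in (("2 TB SSD", ssd), ("1 TB microSD", microsd)):
    print(f"{name}: {density:,.0f} TB/m^3, short by a factor of {required / density:,.1f}")

cost_usd = INTERNET_TB * 20              # at ~$20 per TB
gigabit_links = 70e12 * 8 / 1e9          # 70 TB/s in units of 1 Gbit/s
print(f"cost: ${cost_usd:,.0f}; bandwidth: {gigabit_links:,.0f} x gigabit")
```

The shortfall factor is simply the required density divided by the device density, so it also tells you how many house-sized volumes you would need for an offline snapshot.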
 2021-05-25, 18:42 #9 Uncwilly 6809 > 6502     """"""""""""""""""" Aug 2003 101×103 Posts 13·769 Posts Would 20,000 HDDs surpass the carrying capacity of the floor? Same question for Ken's numbers.
 Originally Posted by Uncwilly Would 20,000 HDDs surpass the carrying capacity of the floor? Same question for Ken's numbers.
https://www.huduser.gov/Publications/pdf/strdesign.pdf gives a table of "live load" showing 30 lb/sq ft or more is applicable to most rooms in one or two family residences, but not attics. A floor to ceiling dense stack of microSD silicon and plastic would weigh far more than that. Plastic's density is typically near 1 g/cc, and silicon is higher. Some landlords prohibit waterbeds, for load limit reasons or water damage potential.
Concrete pads or footings directly on soil or rock have far higher ratings (2000 lb/sq ft for clay, higher for others). So the bottom floor should be fine, as long as you don't care about the floor finish or possible damage from groundwater seepage. https://www.concretenetwork.com/conc...ils_matter.htm
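
To put rough numbers on the floor-load question, here is a minimal sketch comparing a solid floor-to-ceiling stack against the two ratings mentioned above. The assumptions are mine, not from the cited documents: the stack is treated as solid plastic at 1 g/cc with no air gaps, and the stack height is taken as 8 ft, matching the ceiling height used earlier in the thread. Function and constant names are illustrative.

```python
KG_PER_LB = 0.45359237   # exact definition of the pound
M_PER_FT = 0.3048        # exact definition of the foot

RESIDENTIAL_LIVE_LOAD = 30.0   # lb/sq ft, figure quoted from the HUD table
CLAY_BEARING = 2000.0          # lb/sq ft, figure quoted for clay soil


def stack_load_lb_per_sqft(density_g_per_cc, height_ft):
    """Weight pressing on each square foot of floor under a solid stack."""
    density_kg_per_m3 = density_g_per_cc * 1000.0
    density_lb_per_ft3 = density_kg_per_m3 / KG_PER_LB * M_PER_FT ** 3
    return density_lb_per_ft3 * height_ft


load = stack_load_lb_per_sqft(1.0, 8.0)
print(f"stack load: {load:.0f} lb/sq ft")
print(f"vs residential live load: {load / RESIDENTIAL_LIVE_LOAD:.1f}x")
print(f"vs clay bearing rating: {load / CLAY_BEARING:.2f}x")
```

Under these assumptions the stack is far over the upper-floor rating but within the clay figure, which agrees with the reasoning above. Silicon is denser than plastic, so a real stack would be somewhat heavier.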
There are other aspects to consider too. https://youtu.be/DCw1NbrQtqE

Last fiddled with by kriesel on 2021-05-25 at 19:54

 2021-05-25, 21:16 #11 firejuggler     Apr 2010 Over the rainbow 2×33×72 Posts Semi-relevant, xkcd again: https://what-if.xkcd.com/31/

