TechCrunch visualises malware data volumes using hard drive stacks
A manual calculation reveals that VirusTotal’s 31 petabytes of samples would stack to 2,645 feet, while vx-underground’s 30 terabytes amounts to just 30 inches.

TechCrunch has published a visualisation illustrating the sheer physical scale of malware data held by major cybersecurity entities, using a stack of 1-terabyte hard drives to represent the volume. The analysis highlights that vx-underground, a malware research group claiming the largest collection of malware source code, holds approximately 30 terabytes of data. In comparison, VirusTotal, an online service founded by Bernardo Quintero that scans files across multiple antivirus engines, stores about 31 petabytes of samples contributed by users.
The publication performed a manual calculation after an AI chatbot provided an incorrect answer to the same query. The visualisation assumes the use of standardised 3.5-inch internal hard drives with a height of 1 inch. While the article notes that the usable file capacity of a hard drive is generally less than its advertised capacity, the calculation uses the advertised 1-terabyte figure for simplicity.
Under these assumptions, vx-underground’s 30 terabytes of malware source code would stack to approximately 30 inches, or 2.5 feet. For visual reference, the article compares this height to a reporter standing 6 feet tall. The article presents the data volumes as approximate figures rather than precise audits, and the height calculation relies on idealised drive dimensions.
VirusTotal’s 31 petabytes of malware samples would stack to approximately 2,645 feet. To contextualise this height, TechCrunch compared the stack to global landmarks, noting that the Burj Khalifa in Dubai stands at 2,722 feet and the Eiffel Tower is 1,083 feet tall. On those figures, the VirusTotal stack would fall about 77 feet short of the Burj Khalifa and stand roughly as tall as two-and-a-half Eiffel Towers.
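The arithmetic behind both figures can be reproduced in a few lines. The following is a minimal sketch, not TechCrunch's own working: the function and constant names are hypothetical, and it assumes 1 petabyte is counted as 1,024 terabytes, since that is the conversion that yields the reported 2,645 feet (a strict 1,000-terabyte petabyte would give about 2,583 feet).

```python
import math

INCHES_PER_FOOT = 12
EIFFEL_TOWER_FT = 1083  # height quoted in the article


def stack_height_feet(data_tb, drive_capacity_tb=1.0, drive_height_in=1.0):
    """Height in feet of a stack of drives holding data_tb terabytes.

    Defaults follow the article's assumptions: 1-terabyte drives at their
    advertised capacity, each 1 inch tall.
    """
    drives = math.ceil(data_tb / drive_capacity_tb)  # whole drives only
    return drives * drive_height_in / INCHES_PER_FOOT


# vx-underground: 30 terabytes
vx_feet = stack_height_feet(30)

# VirusTotal: 31 petabytes.
# Assumption: 1 PB = 1,024 TB, which matches the published 2,645 feet.
vt_feet_binary = stack_height_feet(31 * 1024)

# For comparison only: the decimal convention, 1 PB = 1,000 TB.
vt_feet_decimal = stack_height_feet(31 * 1000)

print(f"vx-underground: {vx_feet} ft")
print(f"VirusTotal (1,024 TB per PB): {vt_feet_binary:,.0f} ft")
print(f"VirusTotal (1,000 TB per PB): {vt_feet_decimal:,.0f} ft")
print(f"Eiffel Towers: {vt_feet_binary / EIFFEL_TOWER_FT:.2f}")
```

Changing `drive_capacity_tb` or `drive_height_in` shows how sensitive the result is to the idealised drive dimensions; larger-capacity drives would shrink both stacks proportionally.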
These repositories serve as critical resources for cybersecurity companies, AI researchers, and threat intelligence firms, which use the datasets to train detection models and understand evolving attack vectors. The comparison underscores the difference in scale between the two collections: a petabyte is approximately 1,000 times larger than a terabyte (1,024 times in binary units, the ratio the 2,645-foot figure implies).


