News

More than a thousand images of child sexual abuse material were found in a massive public dataset that has been used to train popular AI image-generating models, Stanford Internet Observatory ...
Consisting of 14 million code examples, 500 million lines of code, and 55 programming languages including C++, Java, Python, Go, COBOL, Pascal, and FORTRAN, CodeNet is approximately 10 times ...
A massive open-source AI dataset, LAION-5B, which has been used to train popular AI text-to-image generators like Stable Diffusion 1.5 and Google’s Imagen, contains at least 1,008 instances of ...
LAION-5B, a dataset used by Stable Diffusion, contained thousands of links to child sexual abuse material that could influence AI image generation.