One million (small web) screenshots

(nry.me)

166 points | by squidhunter 46 days ago

13 comments

  • foxfired 42 days ago
    I found my own blog [0]. But interestingly, the letter I is missing in the screenshots starting from July 2025.

    [0]: https://onemillionscreenshots.com/idiallo.com/screenshot

    • KomoD 40 days ago
      If it was unclear, OP's site is screenshots.nry.me, not onemillionscreenshots.com
  • chrismorgan 42 days ago
    There are many patches of almost-identical sites.

    Some of them are due to many people using the same theme.

    Some of them are expired or parked domains, which I reckon should be detected and excluded.

    • coldpie 41 days ago
      Yeah, those clusters are interesting. They stand out, so they were the first thing I zoomed in on, then I realized they're all just stock resume sites. You quickly realize the clusters are something to avoid. Turns out to be an effective visualization method.
      • chrismorgan 41 days ago
        The thing I find interesting is where the grouping is robust to colour variations: one of the bigger groups is around 25% from left, 20% from bottom, all one theme but in a wide variety of colours.
    • Thorrez 41 days ago
      Yeah, I wonder why parked domains are included. Are there not at least 1 million actual websites?
    • stackghost 42 days ago
      >Some of them are due to many people using the same theme.

      Teeming masses of sites using what probably seems to the authors like a fresh, unconventional look but ends up being Yet Another.

      • arjie 42 days ago
        I doubt anyone selecting a popular theme is confused by the fact that it’s popular. I use the default Mediawiki theme for mine, for instance.
  • frankcaron 41 days ago
    I found my own site, as well, and I found that particularly charming.

    Recently went back [0] to the open web and feel like this inclusion alone justified that move.

    Thanks for sharing. Humble and heart-warming way to end 2025 for an old Internet man.

    [0]: https://frankycaron.medium.com/of-an-open-web-rebirth-and-bi...

  • vintagedave 41 days ago
    I’m curious how the choice of which blog is located next to which was made. The writeup mentions “dimensionality”. I found my blog, and the eight surrounding it are interesting people, but every one of them is an AI researcher with degrees from Berkeley or similar, and the sites are predominantly CVs.

    Luminous company but not my level, nor is my blog about AI, nor is it a CV. I can’t see any reason for the location.

    • coldpie 41 days ago
      I think it is literally by the colors of the screenshots. Nothing to do with the contents.

      > I just want to encode the high level aesthetic details of webpage screenshots. Because of this, I fell back on an old friend: the triplet loss on top of a small encoder. The resulting output dimension of 64 afforded ample room for describing the visual range while maintaining a considerably smaller footprint.
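
      For anyone unfamiliar with the triplet loss mentioned in that quote, here is a minimal sketch of the idea. Only "triplet loss on a small encoder with 64 output dimensions" comes from the writeup; the Euclidean distance, the margin of 1.0, and the toy 2-D vectors below are my assumptions for illustration, not the author's actual setup.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (margin is an assumed hyperparameter).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings (the real ones would have 64 dimensions).
anchor = [0.0, 0.0]
similar_looking = [0.0, 1.0]    # distance 1 from the anchor
different_looking = [3.0, 4.0]  # distance 5 from the anchor

well_separated = triplet_loss(anchor, similar_looking, different_looking)
badly_separated = triplet_loss(anchor, different_looking, similar_looking)
```

      Training an encoder against a loss like this pulls similar-looking screenshots together and pushes dissimilar ones apart, which would explain neighbours being grouped by look rather than by content.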

  • jot 42 days ago
    So good to see this different approach! The clustering looks really cool and love that the focus is not on the most popular websites.

    Here’s another Christmassy alternative: https://display.archive.org/xmas

    I’m one of the makers of OneMillionScreenshots.com and I’m currently working on an update to it.

  • AndrewStephens 42 days ago
    I started by finding my own blog and scrolling north, south, east, and west to see my neighbours. I’ve already found several interesting sites and a new person to follow on mastodon.

    It’s a shame there doesn’t seem to be any way to link to a particular position on the map but great stuff nevertheless.

  • cosmicgadget 42 days ago
    That's a lot of fun to explore. I'm not entirely convinced by the "you can judge a book by its cover" thing, there are so many "Hi, I'm _____" pages that might have content or might just be portfolio stubs.
  • yoyo250 42 days ago
    Maybe you could add a timeline and a clock

    Timeline: view older versions

    Clock: view light/dark mode theme according to user time zone (or enable dark/light mode manually)

    I'm also a bit curious, since most web pages are predominantly white, how many of them are adapted to dark mode?

  • ErroneousBosh 42 days ago
    Shit, my blog is on there. I should post on it more frequently than once every two years.

    My forum isn't, though. With a post every day or so and nearly 50 active users, it's probably not "small web" any more :-D

  • nickradford 41 days ago
    Oh wow, my little profile/blog is on here, nice!
  • troupo 41 days ago
    Long tail on the internet is truly long.

    My website [1] gets perhaps as many as 200 visitors a week according to Cloudflare. And it's still there at number 399322 (first half of the pack).

    [1] https://onemillionscreenshots.com/dmitriid.com

  • nathaah3 42 days ago
    i didn't know about onemillionscreenshots before but..

    this is one of the coolest blogs i have ever read!

  • ctxc 41 days ago
    Very surprised to see my website on there. But I'm assuming it's >6 months old because I went batshit crazy on the UI recently.
    • elaus 41 days ago
      Fyi the link in your HN profile 404s (but the website looks nice, good work!)
      • ctxc 40 days ago
        Thank you so much!

        Nice catch, fixing :D