Monday, April 21, 2008

new search!

New search has gone up!

As one of my first tasks as a new employee, I've totally replaced the reddit search function. (Just in time, it seems.) New index, new search library, all new articles! Okay, I can't take credit for the new articles. That's all you.

The new search code is tuned with the assumption that there is one article that you are looking for when you search. It also assumes that that article was relatively recent, and got enough points that you were likely to see it. So it's tuned to find that one result for you. It's not that great for browsing by keyword (run a search for "vote up if" and you'll see that you won't get all of the posts with that term on the first page of results), but it should be great for, you know, searching.
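To make the tuning idea concrete, here is a minimal sketch of how a text-match score could be boosted by points and recency. It is not the actual reddit search code: the function name `rank_score`, the weights, the 30-day half-life, and the sample articles are all assumptions made up for illustration.

```python
import math

def rank_score(text_relevance, points, age_days,
               points_weight=1.0, recency_weight=1.0, half_life_days=30.0):
    """Combine a text-match score with boosts for points and recency.

    Every parameter here is hypothetical; the post does not say how
    the real search combines these signals.
    """
    # Log scaling so 500 points helps, but does not drown out the text match.
    points_boost = math.log1p(max(points, 0))
    # Exponential decay: the boost halves every half_life_days.
    recency_boost = 0.5 ** (age_days / half_life_days)
    return text_relevance * (1.0 + points_weight * points_boost
                             + recency_weight * recency_boost)

# Two made-up articles that match the query text equally well.
articles = [
    {"title": "old, low-scoring match", "relevance": 1.0, "points": 2, "age_days": 400},
    {"title": "recent, popular match", "relevance": 1.0, "points": 500, "age_days": 3},
]

ranked = sorted(
    articles,
    key=lambda a: rank_score(a["relevance"], a["points"], a["age_days"]),
    reverse=True,
)

for article in ranked:
    print(article["title"])
```

With equal text relevance, the recent, high-scoring article sorts first, which is the "find that one result" behavior described above and also why a broad keyword browse will not surface every match on the first page.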

Oh, and you can now search for C++. You're welcome, C++ programmers that have been complaining about that :)

Some tips:

  • The URL is included in the searched fields. If you know that the article was posted to a blogspot domain, put "blogspot" in the search field (example).

  • ...so is the reddit. If you put "politics" in the search field, it will weight articles from that reddit (example).

  • ...so is the username that posted it (example).

  • The + and - operators are supported, as well as the AND and OR modifiers (example).

  • The more keywords, the better. By far. Stemming is done, so you can approximate within reason (see the first few results of this search and compare to the search terms).

  • Quotes help, too (example).

  • If you know it was relatively recent, try limiting the time period with the "links from" on the sidebar (example).

  • You can search for reddits. That's actually been there for a while, but I'd like to believe that the results are better now
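
As a rough illustration of the + and - operators and quoted phrases from the tips above, the sketch below splits a query string into required, excluded, and optional terms. It is an assumption-laden toy, not the parser of the new search library: `parse_query` and the `TOKEN` pattern are hypothetical names, and the AND and OR modifiers and stemming are deliberately left out.

```python
import re

# An optional leading + or -, followed by either a "quoted phrase" or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into required (+), excluded (-), and optional terms.

    Hypothetical helper for illustration only; AND/OR and stemming
    are not handled here.
    """
    required, excluded, optional = [], [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        if not term:
            continue  # skip an empty quoted phrase such as ""
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return {"required": required, "excluded": excluded, "optional": optional}

parsed = parse_query('+python -java "vote up if" C++')
print(parsed)
```

Because only a leading + or - is treated as an operator, a term like C++ passes through as a single keyword in this sketch.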


If it's still not good enough (will it ever be good enough for redditors?), there are some tuneables in there that can be tweaked so that it works the best for the greatest number of users without another re-write (for instance, the weighting for the current reddit, score and newness, whether a search defaults to a weighted OR or an AND, etc). The various tuneables have trade-offs, and it may take some time to get it right, but I'm very confident about the new code. I welcome good-natured feedback. It helps. Really. Maybe my assumptions about recentness and points are broken. Tell me. Send feedback about a search that failed to return what you expected.
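
For readers curious what tuneables like these might look like, here is a small sketch under stated assumptions. The `SearchTuneables` class, its field names, and its default values are invented for illustration and are not taken from the real code; it only shows how a default of weighted OR versus AND, plus a boost for the current reddit, could change a result's score.

```python
from dataclasses import dataclass

@dataclass
class SearchTuneables:
    # Hypothetical knobs; names and defaults are assumptions, not reddit's.
    current_reddit_weight: float = 2.0
    default_operator: str = "OR"  # "OR" (weighted) or "AND"

def match_score(query_terms, doc_terms, doc_reddit, current_reddit, tuneables):
    """Score one document against the query terms under the given tuneables."""
    hits = sum(1 for term in query_terms if term in doc_terms)
    if tuneables.default_operator == "AND" and hits != len(query_terms):
        return 0.0  # AND: every term must be present
    score = float(hits)  # weighted OR: more matching terms, higher score
    if doc_reddit == current_reddit:
        score *= tuneables.current_reddit_weight
    return score

query = ["new", "search"]
doc = {"new", "code"}

or_score = match_score(query, doc, "programming", "programming", SearchTuneables())
and_score = match_score(query, doc, "programming", "programming",
                        SearchTuneables(default_operator="AND"))
print(or_score, and_score)
```

The trade-off shows up directly: with weighted OR a partial match still appears, just lower in the results, while AND drops it entirely.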