Value is subjective. Therefore the value of information is also subjective. This means that, given enough diversity in people's preferences, no objective scoring of information can reliably provide value to everyone.
Instead, subjective scoring is needed. Omahaku aims to provide the means to rate information, information sources, and other information raters, so that you can fully control your information intake.
Our current reputation and matching algorithms use different forms of memory-based collaborative filtering (CF) techniques. Memory-based CF seems like a good choice when the user needs to have precise control over their recommendations. There are some challenges to this approach though.
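For readers unfamiliar with the term, here is a minimal sketch of user-based, memory-based CF. It is not Omahaku's actual algorithm: the toy `ratings` data, the rating scale of -1 to 1, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating between -1 and 1}.
ratings = {
    "alice": {"a": 1.0, "b": -1.0, "c": 1.0},
    "bob": {"a": 1.0, "b": -1.0, "d": 1.0},
    "carol": {"a": -1.0, "b": 1.0, "d": -1.0},
}


def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict(user, item, ratings):
    """Similarity-weighted average of other users' ratings of the item.

    Returns None when nobody comparable has rated it (data sparsity).
    """
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine(ratings[user], theirs)
        num += sim * theirs[item]
        den += abs(sim)
    return num / den if den else None


prediction = predict("alice", "d", ratings)
```

Here alice agrees with bob and disagrees with carol on the items they share, so bob's positive rating and carol's negative rating of "d" both push the prediction for alice upward. Every step works directly on stored ratings, which is what keeps the result explainable and controllable.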
Giving the user control of a recommendation system has its drawbacks. Using such a system requires more effort, and the algorithms must be simple enough to be controllable by the user. If the system is too difficult to use, it simply won't work. The UI and the algorithms are strongly coupled to each other, which is not the case with recommendation systems where the user has less control. The main design problem is creating a convenient UI for controlling a useful recommendation algorithm.
The user doesn't get recommendations based only on the ratings of like-minded users, who may not always exist due to data sparsity. They also get them based on 1) topics they've rated, 2) information sources they've rated, and most importantly 3) other users they've rated positively. With these mechanisms, one rating can yield thousands of good recommendations even if rating data is sparse. A single positive rating of a new item is enough to make it reputable in the trusted network (provided there are no other ratings interfering). Gray sheep still get recommendations from their network, and the cold start lasts for only a few ratings.
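The sketch below shows how the three signals might be combined. It is an illustration under assumptions, not the service's implementation: the `Item` fields, the `recommend` function, and the simple "any signal matches" rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # All field names are hypothetical, chosen for this example.
    url: str
    topic: str
    source: str
    rated_up_by: frozenset = frozenset()


def recommend(items, liked_topics, liked_sources, liked_users):
    """Return (url, reasons) for every item matched by at least one signal."""
    results = []
    for item in items:
        reasons = []
        if item.topic in liked_topics:
            reasons.append("topic")
        if item.source in liked_sources:
            reasons.append("source")
        if item.rated_up_by & liked_users:
            reasons.append("trusted user")
        if reasons:
            results.append((item.url, reasons))
    return results


items = [
    Item("https://example.org/1", "physics", "example.org"),
    Item("https://example.org/2", "cooking", "example.org"),
    Item("https://other.net/1", "cooking", "other.net", frozenset({"ann"})),
    Item("https://other.net/2", "politics", "other.net"),
]

# One positive rating of a source surfaces every item from that source.
by_source = recommend(items, set(), {"example.org"}, set())
# One positive rating of a user surfaces everything that user rated up.
by_user = recommend(items, set(), set(), {"ann"})
```

In this toy data a single rating of one source already yields two recommendations; with a large catalog the same rating would yield proportionally more, with no like-minded neighbors required.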
Filtering out manufactured recommendations is partially solved by limiting the CF neighborhood to the user's trusted network. The trusted network produces one component of the reputation score. If a herd of shilling accounts appears and starts an attack, their ratings have no effect on the network scores. Only when an authentic user takes a shilling account into their trusted network can the attack start to take hold. If such infiltration occurs and a user gets a bad recommendation, they can mark the recommender as not trustworthy, removing them from their trusted network. The cost of creating an effective shilling attack becomes high and the cost of shutting it down becomes low.
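A rough sketch of this behavior follows. It assumes that the trusted network consists of everyone reachable through positive user ratings within a fixed number of hops, and that a negative user rating blocks that account; the function names, the two-hop limit, and the plain averaging are assumptions for illustration, not the actual scoring.

```python
from collections import deque


def trusted_network(user, trust, max_hops=2):
    """Users reachable from `user` via positive ratings, minus blocked ones."""
    # Accounts the user has explicitly marked as not trustworthy.
    blocked = {p for p, r in trust.get(user, {}).items() if r < 0}
    hops = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue
        for peer, rating in trust.get(current, {}).items():
            if rating > 0 and peer not in hops and peer not in blocked:
                hops[peer] = hops[current] + 1
                queue.append(peer)
    return set(hops) - {user}


def network_score(user, item, trust, item_ratings):
    """Average item rating, counting only raters inside the trusted network."""
    network = trusted_network(user, trust)
    votes = [r for rater, r in item_ratings.get(item, {}).items() if rater in network]
    return sum(votes) / len(votes) if votes else None


# "me" trusts ann, ann trusts ben; three shilling accounts promote "spam".
trust = {"me": {"ann": 1}, "ann": {"ben": 1}}
item_ratings = {
    "spam": {"shill1": 1, "shill2": 1, "shill3": 1},
    "good": {"ben": 1},
}
```

With this data, `network_score("me", "spam", trust, item_ratings)` finds no votes that count, because none of the shilling accounts are in the network. If ann were to rate `shill1` positively, the spam item would gain a score; setting `trust["me"]["shill1"] = -1` would remove that account from the network again.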
Scaling the system to millions of users and items while keeping to memory-based CF might be impossible. It might be that memory-based CF must be limited and augmented with a better-scaling recommendation system (in a way that doesn't degrade recommendation quality too much). That said, if the service manages to provide very valuable information, its being more expensive to operate than e.g. traditional search engines might be justified.
The user can choose the sharing settings for each item (its rating and categorization). Currently the service has four settings for rating visibility: private, friends, network and public.
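As a sketch, the four settings could be modeled as below. The names of the settings come from the list above; what each one means exactly is an assumption here (friends as an explicit list kept by the owner, network as the owner's trusted network, and network visibility also covering friends), as is the `can_see` helper.

```python
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"
    FRIENDS = "friends"
    NETWORK = "network"
    PUBLIC = "public"


def can_see(visibility, viewer, owner, friends, network):
    """Decide whether `viewer` may see a rating owned by `owner`.

    `friends` and `network` are sets of user names belonging to the owner.
    The semantics are assumed for this example.
    """
    if viewer == owner or visibility is Visibility.PUBLIC:
        return True
    if visibility is Visibility.NETWORK:
        return viewer in network or viewer in friends
    if visibility is Visibility.FRIENDS:
        return viewer in friends
    return False  # PRIVATE: only the owner sees it
```

For example, `can_see(Visibility.FRIENDS, "ben", "me", {"ann"}, {"ann", "ben"})` is false under these assumptions, because ben is in the network but not on the friends list.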
There's too much data on the internet to be indexed by any search engine. Omahaku too must at some point limit how many items it contains. However, for the system to work, anyone should still be able to add a link they think is valuable to the system. One solution is to give each natural person a link budget. The budget size multiplied by the number of people in the world would then cap how many pages Omahaku indexes. To increase one's allocation, one would buy link space from others in an auction. This way 1) any person could add an important page to the index via their personal link budget, 2) companies could buy large amounts of link space to get their catalogs etc. indexed, and 3) the index size would remain capped, solving the problem.
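The idea can be sketched as a small ledger. Nothing here is implemented in the service: the `LinkLedger` class, the highest-bid-wins rule, and the numbers in the example are assumptions chosen to show that trading moves link space around without raising the cap.

```python
class LinkLedger:
    """Hypothetical ledger of link budgets with a fixed total cap."""

    def __init__(self, people, per_person):
        self.budget = {p: per_person for p in people}
        self.used = {p: 0 for p in people}
        self.cap = per_person * len(people)  # never changes afterwards

    def add_link(self, holder):
        """Spend one slot of the holder's budget on a new link."""
        if self.used.get(holder, 0) >= self.budget.get(holder, 0):
            raise ValueError(f"{holder} has no free link space")
        self.used[holder] = self.used.get(holder, 0) + 1

    def sell(self, seller, slots, bids):
        """Sell unused slots to the highest bidder; returns (winner, price).

        `bids` maps bidder name to offered price. Bidders need not be
        natural persons, so a company can win space without having a
        personal allocation of its own.
        """
        free = self.budget[seller] - self.used[seller]
        if slots > free:
            raise ValueError("cannot sell link space that is already in use")
        if not bids:
            raise ValueError("no bids")
        winner = max(bids, key=bids.get)
        self.budget[seller] -= slots
        self.budget[winner] = self.budget.get(winner, 0) + slots
        self.used.setdefault(winner, 0)
        return winner, bids[winner]


ledger = LinkLedger(["ann", "ben"], per_person=3)
winner, price = ledger.sell("ann", 2, {"acme": 5.0, "ben": 1.0})
```

After the sale the company holds two slots and ann holds one, while the sum of all budgets still equals `ledger.cap`. The index can therefore never grow beyond the cap, no matter how the space is traded.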
Near-future development goals:
The company is owned by me, Tuukka Pensala. I'm also the developer. See my profile