The Importance of Flexibility

November 5th, 2008

Sleenk is intended to accept input from a variety of different sources. During the original design process, a major hurdle became apparent. How would it be possible to accommodate users who didn’t want to dedicate much time to Sleenk? Additionally, despite the attention being paid to ensuring the privacy of every end-user, it seemed inevitable that certain users would be unwilling to allow Sleenk to automatically collect information. Would it be possible to provide those users with a solution? The most obvious answer, it seemed, would be to provide every Sleenk user with a choice.

A core principle of Sleenk is that the system should never expect to consume more than a few minutes of a user’s time in order to produce meaningful results. Thus, the idea of a Sleenk Bar was born. The Sleenk Bar runs in the background, autonomously monitoring a user’s browsing activity, while still allowing them to exclude certain sites from being shown to others or calculated as part of their strand.

The Sleenk Bar is a good way to supply the system with input, but some people would rather not install third-party software, and that is perfectly understandable. Sleenk products do not contain any spyware or viruses, but many third-party offerings do, and users are right to be skeptical. For those users there is the Sleenk portal, which collects information in an active manner: the user visits it and gives the system feedback on a regular basis. In exchange, that user is not required to install any software in order to use Sleenk, and it is hoped that the user will come to trust the system after using it for a while. The Sleenk portal is also available to those using the Sleenk Bar, for whom it doubles as a home page.

It’s important to give the users of a system as much flexibility as possible, and that is exactly what the design of Sleenk tries to accomplish. In the future, this blog will touch on more examples of options given to users within the system.

The User Experience

October 28th, 2008

In order to be useful, a service must be fast, intuitive, and largely automated. Those are the principles that have been adopted in the development of Sleenk. Nobody will use the system if they must, for instance, constantly input their own data. This particular issue can be resolved by offering software designed to automate that work.

Additionally, it must be intuitive. Nobody should need to know the meanings of seed, sprout, and strand — much less anything about the complicated work done on the backend to produce and organize scores — in order to get something out of the system. A new user should be able to sit down, install the optional Sleenk software, and begin using the system. The complete account creation process should take no longer than 5 minutes.

Finally, the interface must be unified. There’s nothing worse than a Web design or software interface with mismatched icons, varying default color schemes, and different terms that mean exactly the same thing. If an icon represents a specific feature on the Web site, then it should also represent that feature in software. A user shouldn’t need to remember the aesthetic differences between different areas of the system.

While this entry may be short, it says a lot about how Sleenk is developed, down to the very name of the system. We ultimately live in a society where free time is a valuable commodity, and it’s important that a system like Sleenk provides you with the maximum return possible on your investment.

Running from the Bombs

October 26th, 2008

Google bombs, that is. While there are many other terms that refer to the manipulation of how a particular resource is perceived by a search engine or other Web service, perhaps that one is the most infamous. Granted, the problem isn’t limited to Google. Wikipedia, one of the most successful projects to rely primarily on user-generated content, has also experienced floods of vandalism in the past. A malicious user might modify a neutral Wikipedia article to push his or her own perspective, or to promote a political agenda. Now and then, vandals attack the online encyclopedia for no real reason. It’s just good for a laugh.

This presents an interesting problem. After all, you have to trust some user input in order to operate a system that runs on user input. The issue, of course, is deciding which user input is trustworthy. That’s why Sleenk tends to calculate a user’s reputation based on his or her strand, among many other factors. Since we’re nearing the United States presidential election, I’ll present an example of a scenario that could occur without these controls in place.

Let’s say that there are two political candidates, Joe Plumber and Jane Executive. Joe’s campaign knows that a good number of Jane’s followers use Sleenk, and he wants misinformation to be recommended to those users by the system. He could simply have his staff members register a large number — depending on the resource — of Sleenk accounts, seed them with data that would cause Sleenk to “think” that they’re similar to Jane’s followers, and then establish many connections to the resources containing misinformation.

Fortunately, Sleenk has been designed with this kind of scenario in mind, and searches for patterns of use that might indicate attempts to manipulate the system. By monitoring the types of Sleenk users accessing a specific resource, as well as how similar they are to accounts that have historically accessed the resource, it can assign more or less weight to their usage. Hence, Joe’s accounts would be assigned much less weight, and their attempt would be drowned out by the normal references to the resource. The signal outweighs the noise.

In other words, while 5% of the Democratic Party might shift to the Republican Party over the span of a decade, it’s extremely unlikely that 35% of the Republican Party would shift to the Democratic Party overnight. Sleenk adheres to this principle, and assigns more weight to long-term trends. While it’s obvious that change occurs, it still tends to occur slowly, and it’s easy to foil a number of potential attacks on the system by simply observing this rule.
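The weighting idea above can be illustrated with a minimal sketch. This is not Sleenk’s actual scoring. The `Access` fields, the 90-day maturity window, and the multiplication of the two factors are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Access:
    # Hypothetical: 0..1 similarity to accounts that have historically
    # accessed the resource.
    similarity: float
    # Hypothetical: age of the account making the access, in days.
    account_age_days: int


def access_weight(access: Access, maturity_days: int = 90) -> float:
    """Weight one access; new or atypical accounts count for less."""
    age_factor = min(access.account_age_days / maturity_days, 1.0)
    return access.similarity * age_factor


def weighted_interest(accesses: list[Access]) -> float:
    """Total weighted interest in one resource."""
    return sum(access_weight(a) for a in accesses)


# Twenty long-standing, typical visitors versus two hundred day-old,
# atypical accounts created to push the resource.
organic = [Access(similarity=0.9, account_age_days=400) for _ in range(20)]
burst = [Access(similarity=0.2, account_age_days=1) for _ in range(200)]

print(weighted_interest(organic) > weighted_interest(burst))
```

Even though the burst is ten times larger, each of its accesses carries so little weight that the established audience still dominates.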

Your Brain on Sleenk

October 25th, 2008

In building Sleenk, an obvious challenge was to overcome inevitable problems associated with false positives. If a user were to visit a resource only once, for example, should that visit affect the user’s strand calculations as much as a resource that is visited frequently? Probably not, as the system would no longer be accurate if this particular issue were overlooked.

Therefore, when designing the system, a great deal of inspiration was taken from the cellular mechanisms that facilitate learning and memory in the human brain, specifically the process called long-term potentiation, which refers to a persistent increase of synaptic strength (“potentiation”) over time. While Sleenk does use a number of other factors to determine the weight assigned to a specific resource, this one is among the most important.

By observing over time which resources are accessed most frequently and determining similarities in the user’s strand score, the strand scores of other users who frequently access that resource, and the Sleenk scoring of the resource itself, Sleenk can provide a much more accurate picture of what interests the user. Thus, more weight is applied to the connections deemed most relevant.
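As a rough illustration of the potentiation idea, the sketch below strengthens a connection each time a resource is visited and lets it fade while the resource is ignored. The update rule, the `rate` and `half_life_days` parameters, and the decay step itself are assumptions for the example, not a description of how Sleenk computes weights.

```python
def potentiate(weight: float, rate: float = 0.2) -> float:
    """Strengthen a connection after a visit, saturating at 1.0."""
    return weight + rate * (1.0 - weight)


def decay(weight: float, days_idle: int, half_life_days: float = 30.0) -> float:
    """Assumed exponential fading while the resource goes unvisited."""
    return weight * 0.5 ** (days_idle / half_life_days)


def connection_weight(visit_days: list[int], today: int) -> float:
    """Weight of one user-to-resource connection given the days it was visited."""
    weight = 0.0
    last_day = None
    for day in sorted(visit_days):
        if last_day is not None:
            weight = decay(weight, day - last_day)
        weight = potentiate(weight)
        last_day = day
    if last_day is not None:
        weight = decay(weight, today - last_day)
    return weight


one_off = connection_weight([0], today=10)
habit = connection_weight(list(range(11)), today=10)
print(one_off < habit)
```

A single visit leaves only a faint trace, while a daily habit pushes the weight close to its ceiling, which is the behaviour the false-positive question at the top of this post calls for.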

Over time, there will be more elaboration in this blog about how Sleenk relates to other natural processes, and why those were chosen specifically to improve the accuracy of the system. After all, natural selection might be considered the most ambitious “development process” ever undertaken, so why not take advantage of progress made over the past few billion years?

Privacy Controls in Sleenk

October 23rd, 2008

By design, Sleenk requires a lot of information to work properly. Your strand (Sleenk profile) says an awful lot about you, including some things that you may not want others to know. That’s why privacy controls are paramount in a system like this one. Sleenk allows a user to hide his or her strand completely, open certain parts of it, or leave everything public. In addition to the controls described below, users may always access Sleenk using SSL (a layer of strong encryption) so that traffic flowing to and from Sleenk cannot be read by a third party who intercepts it.

When a user chooses to hide his or her Sleenk data completely, it’s anonymized in a form that cannot be reversed unless the user logs in. This prevents anyone other than the user, including a person with direct access to the Sleenk database, from being able to trivially associate the data with the user account. An anonymized user is still able to use Sleenk and derive many of its benefits, but the social aspect of using the system is diminished.
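This post does not say how the anonymization is implemented. One way to get a similar property, shown purely as a sketch, is to file hidden data under a pseudonym computed from a key that only exists while the user is logged in. The function names, the per-account `salt`, and the choice of PBKDF2 and HMAC-SHA256 are all assumptions for the example.

```python
import hashlib
import hmac
import os


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a secret key from the password; the key is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def pseudonym(username: str, password: str, salt: bytes) -> str:
    """Identifier under which the hidden strand data would be filed."""
    key = derive_key(password, salt)
    return hmac.new(key, username.encode("utf-8"), hashlib.sha256).hexdigest()


salt = os.urandom(16)  # hypothetical per-account salt stored with the account
at_login = pseudonym("alice", "correct horse battery", salt)
print(at_login == pseudonym("alice", "correct horse battery", salt))
```

Someone reading the database sees the account record and the pseudonymous data, but cannot recompute the link between them without the password.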

The second option provides a good middle ground by allowing the user to involve himself or herself socially with Sleenk, but not in a way that might prove embarrassing. We accomplish this by using an “Awkward List,” which contains resources deemed potentially unflattering. The user can activate the global Awkward List, which will shield those resources from public view, as well as add sites to his or her own personal Awkward List. The Awkward List is available no matter which privacy options have been chosen.
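A minimal sketch of how the two Awkward Lists might combine when deciding what other users can see. The function name, the matching by host name, and the example domains are assumptions. The real lists may match resources differently.

```python
from urllib.parse import urlparse


def visible_resources(urls, global_list, personal_list, use_global=True):
    """Return the URLs that may be shown to other users."""
    hidden = set(personal_list)
    if use_global:
        # The global Awkward List applies only if the user has activated it.
        hidden |= set(global_list)
    return [u for u in urls if urlparse(u).hostname not in hidden]


history = [
    "https://news.example.com/story",
    "https://awkward.example.org/page",
    "https://private.example.net/",
]
print(visible_resources(history, {"awkward.example.org"}, {"private.example.net"}))
```

The personal list always applies, while the global list is opt-in, matching the description above.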

The third option describes an account with all privacy controls disabled. Any other user may view the account’s profile and the information gathered by Sleenk.

Spanking Seed Questions

October 22nd, 2008

Here’s the first post on Sleenky. With any luck, there will be many more. For the time being, this mostly serves as a development journal, so keep reading if you’re interested in how this all came about.

Seed questions allow a new user to start using Sleenk quickly. Without them, Sleenk would need between one and four weeks — depending on usage — to gather enough information to begin reliably delivering results. The answers to these questions allow Sleenk to build a sprout, which contains preliminary information provided to Sleenk by the user, and basically allows the user to begin using Sleenk immediately. Later on, a user (as well as Sleenk developers, albeit in a nonidentifiable fashion) can determine how closely his or her sprout matches the actual Sleenk profile. Not only is it fun for the user, but it allows for the improvement of Sleenk’s accuracy through analysis.

When asked alone, the seed questions are not particularly useful, but the sprout is built as a combination of how the seeds are answered. Once the new user begins using Sleenk, the sprout is immediately altered by his or her usage patterns, and gradually evolves into a hybrid sprout. After enough time has passed and the user has provided enough input to the Sleenk engine, the sprout is no longer used to generate feedback. This stage, which represents a mature Sleenk profile, is called a strand (or mature strand).
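The progression from sprout to hybrid sprout to strand can be pictured as a gradual blend. In the sketch below, the profile is a dictionary of topic scores, and the `maturity` threshold and linear blend are assumptions for the example, not the actual Sleenk engine.

```python
def blend_profile(
    sprout: dict[str, float],
    observed: dict[str, float],
    observations: int,
    maturity: int = 500,
) -> dict[str, float]:
    """Mix seed-question answers with observed usage.

    With no observations the result is the sprout. Once `observations`
    reaches `maturity`, the sprout no longer contributes (a mature strand).
    """
    alpha = min(observations / maturity, 1.0)
    topics = sprout.keys() | observed.keys()
    return {
        t: (1 - alpha) * sprout.get(t, 0.0) + alpha * observed.get(t, 0.0)
        for t in topics
    }


sprout = {"politics": 0.8, "cooking": 0.2}
observed = {"politics": 0.2, "cooking": 0.6}
print(blend_profile(sprout, observed, observations=250))
```

Anything between zero and `maturity` observations corresponds to the hybrid sprout stage.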

So, what exactly does a seed question look like? Typically, seed questions are designed to result in very polarized responses, which is why a combination of seeds is required to form a sprout. Here’s an example: “It is a healthy and normal practice to spank children. Agree or disagree?” Really? Shame on you. ;-)

Obviously, the answer to that question alone would not result in a useful sprout. However, as the new user is prompted with multiple seed questions, patterns that are considered useful to Sleenk gradually emerge.