Create your own Search Engine with Yahoo! BOSS

Yahoo! recently launched BOSS, which opened their search platform to developers around the world. They didn’t just create an API to access data; that has been around for a while. Yahoo! has opened the data to developers with no limits on requests and no restrictions on icon use or results display. You don’t even need to let people know the search engine is Yahoo! based.

Yahoo BOSS plus your special sauce equals search innovation

This open approach lets anyone build a search engine tailored to their particular skills, mash the data with other sources, re-arrange results, or pursue any other novel idea for the next king of search. You could also use BOSS to add search capabilities to a pre-existing site, limiting the results to just its data.

Why is Yahoo! doing this? It’s a brash approach to push search beyond its current status of pages with ordered sets of links. You can let your imagination fly, with the only cost being your personal development investment. I recently took this challenge and decided to build a search engine for vegetarians.

V3GGIE – A Vegetarian Search Engine

My goals were simple:

  1. Create a simple site that could be copied easily as a proof of concept for other genres. Document the construction for others to explore.
  2. Keep it fast with minimal JavaScript and images
  3. Use as much Yahoo infrastructure as possible to minimize development time
  4. Most importantly: return information relevant to the niche audience: Vegetarians and Vegans.

I’m not a PHP expert and some of my code is crude. I hope to clean it up and add a number of features to enhance performance and usability. However, the code samples will still be useful to the PHP beginner. More advanced PHP programmers could easily see where they’d take the concepts and improve on them.

Set up the basic structure

V3ggie has a basic workflow: there’s an input page and a result page. Arguably, this should be a single page that displays the original landing and subsequent results. I have separated them as I hope to create extra content that is appropriate to either the landing or results pages.

Further, there are several search engines built into this site. Each has a specific set of resources to fine tune the results. Currently, these are built with subdirectories (/recipes/, /blogs/, /news/, /local/ ). Each subsection includes index and result pages. This could be changed by utilizing rewrite rules. I’ve kept it simple for now.

Set up the Resources

The BOSS API allows you to create a query param with a list of domains to search through. This is the easiest way to fine tune your results. For instance, the V3ggie recipe search page uses a list of vegetarian cooking sites as well as the vegetarian subdirectories of Epicurious and FoodNetwork.
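Here is a minimal sketch of that idea in Python, the language of the BOSS Mashup Framework. It assembles a search URL restricted to a list of domains. The endpoint, the `appid`, `format`, `count`, and `sites` parameter names, and the example domains are all assumptions made for illustration; check the BOSS documentation for the exact request format.

```python
from urllib.parse import quote, urlencode

# Assumed endpoint and parameter names; verify against the BOSS documentation.
BOSS_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"


def build_search_url(query, domains, app_id, count=10):
    """Return a search URL limited to the given list of domains."""
    params = {
        "appid": app_id,
        "format": "json",
        "count": count,
        # Comma-separated list of domains to search within.
        "sites": ",".join(domains),
    }
    # The query is part of the path, so percent-encode it fully.
    return BOSS_ENDPOINT + quote(query, safe="") + "?" + urlencode(params)


# Hypothetical domain list for a recipe search.
recipe_sites = ["example-veg-recipes.com", "example-vegan-kitchen.org"]
url = build_search_url("lentil soup", recipe_sites, app_id="YOUR_APP_ID")
print(url)
```

Keeping the domain list in one place makes it easy to give each section (recipes, blogs, news) its own list while sharing the same request code.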

Technorati is the source for blog buzz. V3ggie searches through blogs tagged with “vegetarian” and/or “vegan”. This helps get the vegetarian viewpoint for any subject.

You have complete freedom to mash the data as much as you like. You could take the search results and mix them with other data, such as the page rank for a result page, the company or product’s appearance on Wikipedia, or perhaps data you’ve stored in your own databases. I can imagine creating an internal product search page that cross-references the results with a list of preferred vendors to encourage employees to purchase supplies from the correct vendor.
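The preferred-vendor idea could look something like the sketch below. The result format (a list of dicts with `title` and `url` keys), the vendor domains, and the function names are hypothetical; a real version would read the vendor list from wherever you store it.

```python
from urllib.parse import urlparse

# Hypothetical preferred-vendor domains; in practice this might come from a database.
PREFERRED_VENDORS = {"approved-supplies.example", "trusted-vendor.example"}


def host_of(url):
    """Return the lowercase host name of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def rank_with_vendors(results, preferred=PREFERRED_VENDORS):
    """Flag results from preferred vendors and move them to the top.

    sorted() is stable, so the search engine's ordering is kept within each group.
    """
    flagged = [dict(r, preferred=host_of(r["url"]) in preferred) for r in results]
    return sorted(flagged, key=lambda r: not r["preferred"])


# Hypothetical search results, shaped as simple dicts.
results = [
    {"title": "Cheap staplers", "url": "http://random-shop.example/staplers"},
    {"title": "Staplers", "url": "http://www.approved-supplies.example/staplers"},
]
ranked = rank_with_vendors(results)
print([r["title"] for r in ranked])  # ['Staplers', 'Cheap staplers']
```

The `preferred` flag is kept on each result so the page template can also highlight those links instead of only reordering them.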

Set up the Platform

V3ggie is built with PHP. However, Yahoo! has also provided a Python platform, the BOSS Mashup Framework, for building sites very quickly. You can combine this with the Google App Engine to create a custom search engine in a short time. Four Hour Search, formerly known as Yuil, is such an example. It got its name from the length of time it required to research a domain name, set up Google Apps, and build the final search web site. Personally, I spent more than 4 hours trying to get Google Apps set up and I really didn’t want to learn yet another language (Python).

YUI on the frontend

The Yahoo! User Interface library handles the tedious, basic formatting of a page. The CSS libraries allow you to create a wide variety of page grids, standardize the fonts, reset browser inconsistencies and establish a common look and feel.

I started the project by using the CSS Grid Builder. This easy-to-use tool sets up the page with the desired columns and includes the base CSS files. I then added the YUI Base CSS file. This takes the plain page and re-establishes the margins and font styles for a basic site. These two CSS files can remove roughly 75% of the CSS you would normally have to write for a site. Now you can concentrate on what makes your site special.

I also wanted to offer different search options from a single interface. For this, I used the YUI TabView package. This combination of CSS and JS allows you to create a semantic set of links and a corresponding series of div-wrapped objects. YUI TabView will turn this into a tabbed interface that even has built-in ARIA support for screen readers.

I had some trouble getting the tabs to look correct. The documentation does not make it very clear that each tab link must contain an em element to get the proper look and feel. I also downloaded the preferred sprite and used Fireworks to change the tab color gradients from blue to green.

Yahoo also provides a design pattern library. This helped me configure my pagination links. Yahoo has spent a lot of time with user testing to make sure things are easy to use.

Create your own search engine

So, what are you waiting for? Visit the Yahoo Developer Network and start by signing up and getting an application key. I will write separate posts that describe how to build various components of the page. I’m looking forward to hearing from better PHP programmers on how to improve the code.

Adding style to your rel attributes with CSS

View the finished example: Adding style to your rel link.

There’s a little attribute in HTML links that is starting to get a bit of attention lately. The “rel” attribute is a sparsely defined attribute that applies some meta information about a link’s relationship to other documents. Unfortunately, this information is usually hidden from your users. Let’s take a light-hearted stab at turning it into a visual element.

Rel attribute usage

The W3C originally intended the rel attribute to describe the relationship of pages to each other, i.e. next, previous, directory, and start. The attribute has since been adopted by the Microformat community for its inherent usefulness. The rel attribute is now used for tags, to define your relationship to someone, and even to tell search engines not to bother following a link.

The opportunities to use the rel attribute are seemingly endless. There are more proposals to define people you don’t like and links for voting.

But all of this flexibility comes at a small price. To remain valid, you need to tell the browser what these new rel values may actually mean. This is handled by linking to appropriate profiles. Just simply insert the profiles into your head tag. Multiple profiles may throw a validation error, but it’s ok. You don’t need to do this for the standard rel values.

We will be using the CSS3 attribute selector functionality to look at the value of the rel attribute and apply some style accordingly. First we’ll add some padding and a background image to any link that has a rel attribute. We’ll then use background positioning to display an icon that is appropriate for the link. It’s a fairly simple hack.

For more information on using attribute selectors, check out my previous posts:

Sample HTML Code

  • This link is ignored by search engines (rel="nofollow")
  • (rel="tag")

Sample CSS

    a[rel] {padding-left:20px; background:url(rel-sprite.png) no-repeat 0 0; }
    a[rel~="help"] {background-position: 0 -350px ;}
    a[rel~="license"] {background-position: 0 -1347px ;}
    a[rel~="nofollow"] {background-position: 0 -1200px ;}
    a[rel~="tag"] {background-position: 0 -47px ;}

    It’s all fun and games

    I’ll be the first to admit this exercise has significant issues. I’m assuming the following are true:

    1. All possible rel attribute values are accounted for in my CSS. If not, the first rule will generate a blank space.
    2. Each link has only one relationship value. Unfortunately, most XFN relationships are defined by multiple values, e.g. rel="met friend colleague". The ~= selector still matches each word in such a list, but only one background position can apply, so only the icon from the last matching rule is shown.

    So, the display of your rel attributes may be a bit off in the edge cases. Keep the spirit light and nobody will say anything… I hope. Have fun with your rel attributes. They’re just sitting there waiting to be used.
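To see how these edge cases play out, here is a small Python model of the rules. It is only a sketch: it covers a subset of the rel values from the sample CSS, and it reduces the cascade to "the last matching rule wins", which holds here because the specific rules all have equal specificity. The function and constant names are made up for illustration.

```python
# Icon rules in source order: (rel value, vertical sprite offset in pixels).
# A subset of the offsets from the sample CSS.
ICON_RULES = [("help", -350), ("license", -1347), ("tag", -47)]
DEFAULT_OFFSET = 0  # from the generic a[rel] rule


def icon_offset(rel):
    """Return the sprite offset a link would get, or None without a rel attribute."""
    if rel is None:
        return None  # a[rel] does not match, so no icon and no padding
    words = rel.split()  # ~= compares against whitespace-separated words
    offset = DEFAULT_OFFSET
    for value, rule_offset in ICON_RULES:
        if value in words:
            offset = rule_offset  # equal specificity: the later rule wins
    return offset


print(icon_offset("tag"))          # -47
print(icon_offset("license tag"))  # -47, because the tag rule comes last
print(icon_offset("met friend"))   # 0, no specific rule matched
```

The last call shows the first issue in the list: an unaccounted value still gets the padding and whatever sits at the top of the sprite.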

    View the finished rel attribute style example.

    Related Information

    Progressive enhancement of links using the CSS attribute selector

    Attribute Selector Test Page

    We have avoided using CSS3 rules for too long. It’s been difficult to justify using rules that won’t work for a significant portion of our audience, Internet Explorer 7 and below. However, Internet Explorer 8 is coming out soon and does work with the features we like.

    I think it’s fairly safe to assume IE7 users will upgrade to IE8 within a short time. Those stuck with IE6 for one reason or another will slowly disappear as they are given new computers or their locked down environments are upgraded.

    So, with the future of CSS3 functionality within reach, I’ve been energized to begin experimenting again. I’ll be writing a series of blog posts over the next few months that look at CSS3 functionality as a progressive enhancement. How can we continue to deliver a perfectly fine web site to IE6 and IE7 and mobile phones while enhancing the functionality of more modern browsers and devices?

    Attribute Selectors

    CSS attribute selectors are the golden ring on the web development merry-go-round. They can be daunting to learn, addictive to use, but then disappointing when you realize they are out of your grasp when you test in Internet Explorer. We can, however, begin using them to add additional functionality based on your pre-existing, semantic code. Attribute selectors give you power to write CSS that pinpoints the stuff you already code, without having to go back and add classes or ids. I’ve written previously about using attribute selectors to let your users know the language of a site they are about to visit. This trick relies on the rarely used hreflang attribute, which identifies the language of the site targeted in a link.

    There are many other attributes in your HTML, such as table headers, image src, link titles, and selected options. Think about all of those juicy attributes just waiting to be targeted. Also think about how you could actually do something useful with them.

    Announce the file type of a link with CSS

    I once worked for a company that had hundreds of thousands of static HTML pages in their intranet. With no content management system, it was impossible to make global changes. The only thing the pages shared was a common set of style sheets. Does this sound familiar? Follow along as we increase your site’s usability in a less than perfect, but efficient way.

    First off, for accessibility, you need to let users know when a link will open a file, what type it is, and how large it is. This is best done by adding it to your HTML code:

    Foo presentation (.pdf, 5kb)

    That delivers the information to everyone, regardless of their browser. This, however, takes time, and updating legacy code this way is a daunting task.

    We can, however, use the attribute selector to target the extension of the link to display the icon and insert the text describing the file type. Here’s the sample HTML code:

    It’s a simple list of links for different types of files. We’ll be looking at the extensions: .zip, .pdf, .doc, .exe, .png, and .mp3. Feel free to extend this list to any extension you so desire. This would be especially helpful for a company that uses proprietary file types within their intranet.

    Now, let’s look at the CSS:

    /* Bulk rule: padding and the shared sprite for every targeted extension.
       Links to .zip files keep the default position (0 0). */
    a[href$=".zip"], a[href$=".png"], a[href$=".pdf"],
    a[href$=".mp3"], a[href$=".doc"], a[href$=".exe"] {padding-left:20px; background:url(bg-file-icons.png) no-repeat 0 0;}

    /* Shift the sprite to the icon for each file type */
    a[href$=".png"] {background-position: 0 -48px;}
    a[href$=".pdf"] {background-position: 0 -99px;}
    a[href$=".mp3"] {background-position: 0 -145px;}
    a[href$=".doc"] {background-position: 0 -199px;}
    a[href$=".exe"] {background-position: 0 -250px;}

    /* Append a short description of the file type after the link text */
    a[href$=".zip"]:after{content: "(.zip file)"; color:#999; margin-left:5px;}
    a[href$=".pdf"]:after{content: "(.pdf file)"; color:#999; margin-left:5px;}
    a[href$=".doc"]:after{content: "(.doc file)"; color:#999; margin-left:5px;}
    a[href$=".exe"]:after{content: "(.exe file)"; color:#999; margin-left:5px;}
    a[href$=".mp3"]:after{content: "(.mp3 file)"; color:#999; margin-left:5px;}
    a[href$=".png"]:after{content: "(.png file)"; color:#999; margin-left:5px;}

    See the final test page.

    Pattern matching in the attribute selector

    We have some limited “regular expression” functionality in CSS3. We can search for an attribute’s presence and match a pattern within the attribute’s value.
    Patrick Hunlon has a good summary of the pattern matching:

    • [foo]: Has an attribute named “foo”
    • [foo="bar"]: Has an attribute named “foo” with a value of “bar” (“bar”)
    • [foo~="bar"]: Value has the word “bar” in it somewhere (“blue bar stools”)
    • [foo^="bar"]: Value begins with “bar” (“barstool”)
    • [foo$="bar"]: Value ends with “bar” (“I was at the bar”)
    • [foo*="bar"]: Value has bar somewhere (“I was looking for barstools”)
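If it helps to think of these operators as plain string tests, the Python sketch below models each one. It is a simplified illustration, not how a browser implements selectors: the function name is made up, and the comparison is a plain case-sensitive string comparison, whereas browsers treat case differently depending on the attribute.

```python
def attr_matches(value, op, pattern=""):
    """Simplified model of CSS attribute selector matching.

    value is the attribute's value, or None if the attribute is absent.
    op is "" for a bare [foo], or one of "=", "~=", "^=", "$=", "*=".
    """
    if value is None:
        return False  # the attribute is missing, so nothing can match
    if op == "":
        return True  # [foo]: the attribute merely has to exist
    if op == "=":
        return value == pattern  # exact match
    if op == "~=":
        return pattern in value.split()  # one whole word in a space-separated list
    if op not in ("^=", "$=", "*="):
        raise ValueError("unknown operator: %r" % op)
    if pattern == "":
        return False  # the substring operators never match an empty pattern
    if op == "^=":
        return value.startswith(pattern)
    if op == "$=":
        return value.endswith(pattern)
    return pattern in value  # "*=": substring anywhere


print(attr_matches("blue bar stools", "~=", "bar"))   # True
print(attr_matches("barstool", "~=", "bar"))          # False, not a whole word
print(attr_matches("barstool", "^=", "bar"))          # True
print(attr_matches("I was at the bar", "$=", "bar"))  # True
```

The second call shows the difference between ~= and *=: the word test fails on “barstool”, while a substring test would succeed.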

    Attach icons to anything with CSS

    The CSS is simply looking to see if the desired extension is at the end of the link href. If so, apply the following styles.

    Adding an icon to the link

    First, we match any of the desired file extensions. We then add a background image and some padding on the left side with a bulk rule. Then the background position on the sprite is adjusted for each particular link type. Combining multiple icons into one background image reduces the number of files the user has to download, making your page faster. This will work with any browser that recognizes attribute selectors, including Internet Explorer 7. However, support for more obscure attributes may be spotty.

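    There’s another peculiarity with pattern matching. Some attribute values are matched case-sensitively while others are not. In HTML, href is one of the attributes whose value is case sensitive in selectors, so standards-compliant browsers will not apply the above rules to a file named FOO.ZIP or foo.Zip. You would need extra rules for those variants, or the case-insensitive i flag from Selectors Level 4, e.g. a[href$=".zip" i], where it is supported.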

    Adding the descriptive text

    Now, we are going to add a bit of descriptive text to each link. We can’t describe the file size, but we can tell the user what type of file it is. This uses the :after pseudo-element with the content property, which is supported by Internet Explorer 8 (yeah!!!) but not Internet Explorer 7 and below (boo!!!).
    We will also adjust the color and give it a bit of spacing.
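Taken together, the icon rules and the :after rules amount to a lookup on the end of the href. The Python sketch below models that lookup, using the sprite offsets from the CSS; the names are hypothetical, .zip is assumed to sit at offset 0 because it has no override, and the comparison here is a plain case-sensitive one.

```python
# Extension -> vertical sprite offset in pixels, taken from the CSS.
# .zip has no override there, so it keeps the default offset of 0.
ICON_OFFSETS = {
    ".zip": 0,
    ".png": -48,
    ".pdf": -99,
    ".mp3": -145,
    ".doc": -199,
    ".exe": -250,
}


def describe_link(href):
    """Return (sprite offset, label) for a targeted file link, else None."""
    for ext, offset in ICON_OFFSETS.items():
        if href.endswith(ext):  # the same test as a[href$=".ext"]
            return offset, "(%s file)" % ext
    return None  # ordinary links get no icon and no label


print(describe_link("/files/foo-presentation.pdf"))  # (-99, '(.pdf file)')
print(describe_link("/about.html"))                  # None
```

Supporting another file type, such as a proprietary intranet format, would mean adding one entry here, which corresponds to one selector in the bulk rule plus two short rules in the CSS.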

    A big step forward with a small chunk of work

    There you have it. A small chunk of CSS coding has now added substantial usability to your legacy pages. While the CSS version is not as accessible as having the data in the actual link code, it’s a significant improvement over nothing at all. Further, there’s no harmful effect on browsers that do not understand the function. You’ve added information, but haven’t taken anything away. This is a win in my book. To save some time and effort, you could just download and use this package of CSS and icons from Alexander Kaiser.

    This rather simple example of attribute selectors and pattern matching can open your eyes to many possibilities. A number of developers have been exploring this potential for the past few years. Take a look at some of these resources for more ideas and have some fun.

    Flickr Video is live

    The often discussed, semi-fabled video feature on Flickr has finally been released. It’s actually pretty cool. They’ve decided not to fight Yahoo! Video or YouTube for video supremacy. Instead, they’ve limited videos to 90 seconds and hope to build a community of shorter, more personal videos that you can mix with your photographs.

    It also includes more storage for your photographs. Here’s a sample of a video that I just posted. It’s a non-captioned capture of a train pulling into the Chemin Vert Metro stop.


    How to fix your K2 powered wordpress blog after upgrading to 2.5

    Did you upgrade to WordPress 2.5 and now discover a fatal error? You may see this error when you log into the admin section if you have enabled the K2 sidebar manager:
    Fatal error: Call to undefined function wp_register_sidebar_widget() in /home/.foo/bar/ on line 31.

    Brad at ChaoticTech has created a simple solution.

    Here’s why: WordPress 2.5 has a slick new dashboard that takes use of widgets to work. K2 blocks widgets when you use Sidebar Modules (which is awesome), so WordPress 2.5 can’t get to widgets.

    What this does is make it so that Widgets is disabled for everywhere so that Sidebar Modules will work, EXCEPT for the dashboard. This pretty much solves it.

    Nice and simple.
    K2 + WordPress 2.5 = Broken? I can fix that

    Visit Chaotic Tech for the PHP code. You’ll simply overwrite the widgets-removal.php file. Thanks Brad, you’ve saved me a ton of headaches.
