whatspop - Kunal Anand

My ColdFusion Tagging Engine/Library

12/05/2005

Every other Web 2.0 application has a tagging system in place. It certainly helps when frameworks like Ruby on Rails take the grunt work out of it. That said, Justin and I were talking about how trivial it is to set up a relatively narrow tagging engine. Here is my extremely low-fidelity approach that I whipped up in a few minutes in ColdFusion.

In setting up a tagging engine, we need to clarify the mechanics of the system. The following rules are loosely based on many tagging engines (including del.icio.us and Flickr):

  1. Tags are delimited by whitespace
  2. HTML tags are escaped
  3. Tags are all lower-cased
  4. Duplicate tags in the list are removed
  5. Tags are sorted alphabetically
While I feel those are good principles for tagging, you may disagree. Now that those rules are out of the way, we'll start by creating a list of tags:

<cfset tags_list = "   CaMELCaSE   comma, 'singlequoted' <b>bold</b> ""doublequoted"" double double  this should be over the limit now " />

You should see some good-looking strings, some HTML, and a duplicate value. Now it's our job to sanitize all of that into an acceptable format for data storage and browser display. Don't worry, it's not too much extra work.

We begin our code by defining a few constraints. Even though these are completely optional, I think it is good to prevent the user from entering lots of very long tags. The amount variable controls the maximum number of tags and the length variable controls the maximum number of characters in a tag.

<cfset amount = 10 />
<cfset length = 20 />

Remember that rather strange tag list from above? It is time to normalize it using some of the loosely defined engine rules from the start of this post.

<cfset tags_list = trim(tags_list) />
<cfset tags_list = lcase(tags_list) />

The first line trims all the leading and trailing spaces. The second line lower-cases all the tags. Now we have to get rid of those duplicate tags (remember "double double" from the list above?):

<cfset tags_struct = structnew() />

<cfloop index="tag" list="#tags_list#" delimiters=" ">
<cfset tags_struct[tag] = "" />
</cfloop>

<cfset tags_list = structkeylist(tags_struct," ") />
<cfset tags_list = listsort(tags_list, "textnocase", "asc", " ") />

For those who are interested, I decided to use structure keys for automatic de-duplication. Another way to de-duplicate the list is to build a second, temporary list and append each tag only if that list does not already contain it, checking with the "listfind" function (the similar-sounding "listcontains" also matches substrings, so "double" would collide with "doublequoted"). For my purposes, the method above is simpler and only requires a flat data structure. After converting the struct keys back to a list, we sort the tags alphabetically. Now we move on to the final step:
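
For comparison, here is a rough sketch of the temporary-list alternative. It is not the approach used in this post; the unique_list variable is a made-up name for illustration, and the snippet assumes tags_list has already been trimmed and lower-cased as shown earlier.

```cfml
<!--- Sketch only: unique_list is a hypothetical temporary variable --->
<cfset unique_list = "" />

<cfloop index="tag" list="#tags_list#" delimiters=" ">
    <!--- listfind matches whole elements and returns 0 when the tag is not yet in the list --->
    <cfif listfind(unique_list, tag, " ") eq 0>
        <cfset unique_list = listappend(unique_list, tag, " ") />
    </cfif>
</cfloop>

<!--- Same alphabetical sort as the struct-based version --->
<cfset tags_list = listsort(unique_list, "textnocase", "asc", " ") />
```

The struct version does the membership check implicitly through key assignment, while this one rescans the temporary list for every tag. For a handful of tags the difference should not matter.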

<cfif tags_list neq "">
    <!--- Never loop past the number of tags we actually have --->
    <cfif listlen(tags_list," ") lte amount>
        <cfset amount = listlen(tags_list," ") />
    </cfif>
    <cfloop from="1" to="#amount#" index="tag">
        <!--- Perform a database upload here --->
        <cfoutput>#htmleditformat(left(listgetat(tags_list,tag," "),length))#</cfoutput>
    </cfloop>
<cfelse>
    Sorry, you need to provide at least one tag
</cfif>

If the list is empty, we report an error. Otherwise, we run an insert for each tag (cfquery), using the listgetat call shown above to get at the current tag in the loop. You can also see how the "amount" and "length" variables cap the number of tags and the length of each one. Note that I prefer to take care of my HTML sanitization here, at output time (this is out of habit). You could choose to deal with it when you are normalizing your tags instead. Finally, this is just a quick proof of concept - I have probably overlooked some security quirks and general ColdFusion issues, such as the pound sign.
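
As a minimal sketch of what the per-tag insert might look like, the loop below binds each tag through cfqueryparam instead of pasting it into the SQL string. Everything about the database is an assumption made for illustration: the "blog" datasource, the "tags" table and its "name" column do not come from the code above, and neither do the i and clean_tag variables.

```cfml
<!--- Sketch only: the "blog" datasource, "tags" table and "name" column are hypothetical --->
<cfloop from="1" to="#amount#" index="i">
    <!--- Truncate the current tag to the configured maximum length --->
    <cfset clean_tag = left(listgetat(tags_list, i, " "), length) />

    <cfquery datasource="blog">
        INSERT INTO tags (name)
        VALUES (<cfqueryparam value="#clean_tag#" cfsqltype="cf_sql_varchar" maxlength="#length#" />)
    </cfquery>
</cfloop>
```

This version stores the tag unescaped and leaves htmleditformat for display time, which matches the habit described above. If you would rather escape during normalization, the same call could be applied before the insert.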

Go play with the code! Incorporate it into some of your projects. If you want to help and extend this, shoot me an email. I have a couple of ideas on how to make this more secure and flexible.

