Use RedCloth to nofollow all user generated links

This post was migrated over from Muziboo DevBlog.

The nofollow attribute (rel="nofollow") on a link tag tells search engine bots not to follow the link or count it as an endorsement of the destination site. It is a measure to combat link spam and to make sure that you don't pass link juice when you don't want to. This is especially important for sites with user-generated content, since they have no control over which links their users post.

At Muziboo, we use RedCloth to format the text that users enter. We also use it to filter out any CSS styling that users add, along with some HTML tags such as embed and img. Let me show you how we nofollow all links automatically.

Create a file called redcloth_extensions.rb and put it in the config/initializers directory. Now put this in the file:
module RedCloth::Formatters::HTML
  include RedCloth::Formatters::Base

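    # Override the default link formatter so that every link RedCloth
    # generates from Textile markup carries rel="nofollow".
    def link(opts)
      "<a href=\"#{escape_attribute opts[:href]}\"#{pba(opts)} rel=\"nofollow\">#{opts[:name]}</a>"
    end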

    private

    # HTML cleansing stuff
    ALLOWED_TAGS = {
      'a' => ['href', 'title'],
      'br' => [],
      'i' => nil,
      'u' => nil,
      'b' => nil,
      'pre' => nil,
      'kbd' => nil,
      'code' => ['lang'],
      'cite' => nil,
      'strong' => nil,
      'em' => nil,
      'ins' => nil,
      'sup' => nil,
      'sub' => nil,
      'del' => nil,
      'table' => nil,
      'tr' => nil,
      'td' => ['colspan', 'rowspan'],
      'th' => nil,
      'ol' => ['start'],
      'ul' => nil,
      'li' => nil,
      'p' => nil,
    }

    def clean_html( text, tags = ALLOWED_TAGS )
        # raw[1] is the optional closing slash, raw[2] the tag name, raw[3] the attributes
        text.gsub!( /<(\/*)(\w+)([^>]*)>/ ) do
            raw = $~
            tag = raw[2].downcase
            if tags.has_key? tag
                pcs = [tag]
                pcs << "rel=\"nofollow\"" if (tag == 'a' and raw[1]=='')
                tags[tag].each do |prop|
                    ['"', "'", ''].each do |q|
                        q2 = ( q != '' ? q : '\s' )
                        if raw[3] =~ /#{prop}\s*=\s*#{q}([^#{q2}]+)#{q}/i
                            attrv = $1
                            next if prop == 'src' and attrv =~ %r{^(?!http)\w+:}
                            # use attrv here: $1 is overwritten by the scheme check above
                            pcs << "#{prop}=\"#{attrv.gsub('"', '\\"')}\""
                            break
                        end
                    end
                end if tags[tag]
                "<#{raw[1]}#{pcs.join " "}>"
            else
                " "
            end
        end
    end

end
To filter the input, simply call RedCloth.new(input_text, [:sanitize_html, :filter_styles]) and then call to_html on the result to get the filtered HTML. These instructions are for Rails 2.3.2. For older versions of Rails, put the redcloth_extensions.rb file in the lib folder and require it in your environment.rb.
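
If you want to see the whitelist-plus-nofollow idea on its own, here is a minimal sketch in plain Ruby that does not need RedCloth at all. It is not RedCloth's implementation: the names SKETCH_ALLOWED and sketch_clean are made up for this illustration, it only understands double-quoted attribute values, and it does not check href for unsafe schemes such as javascript:, so treat it as something to try in irb rather than something to ship.

```ruby
# Hypothetical whitelist for this sketch: tag name => attributes to keep.
SKETCH_ALLOWED = {
  'a'  => ['href', 'title'],
  'b'  => [],
  'em' => []
}.freeze

# Returns a copy of +html+ in which tags missing from the whitelist are
# replaced by a space, attributes missing from the whitelist are dropped,
# and every opening <a> tag gets rel="nofollow".
def sketch_clean(html, allowed = SKETCH_ALLOWED)
  html.gsub(/<(\/?)(\w+)([^>]*)>/) do
    closing, tag, attrs = $1, $2.downcase, $3
    next ' ' unless allowed.key?(tag)        # tag is not whitelisted
    next "</#{tag}>" unless closing.empty?   # closing tags carry no attributes

    parts = [tag]
    parts << 'rel="nofollow"' if tag == 'a'
    allowed[tag].each do |prop|
      # Assumption: only double-quoted attribute values are handled here.
      parts << %(#{prop}="#{$1}") if attrs =~ /\b#{prop}\s*=\s*"([^"]*)"/i
    end
    "<#{parts.join(' ')}>"
  end
end

puts sketch_clean(%(<a href="http://example.com" onclick="x()">hi</a> <script>x</script>))
```

The link keeps its href, loses the onclick handler and gains rel="nofollow", while the script tags are replaced by spaces. The clean_html method above follows the same pattern, with extra handling for single-quoted and unquoted attribute values.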


Hana Mohan