Bugs » #28
This section on the Community is no longer supported, in favour of Wikidot's Official Feedback Site.
It is retained here for archiving purposes.
Posted by James Kanjo on 21 Mar 2009 00:51, last edited on 21 Mar 2009 04:21
This bug is open.
Description
When using Wikidot's built-in search function, you cannot search for words that contain non-alphanumeric characters, because the search parser ignores those characters. Commas (,) and colons (:), however, seem to be interpreted as spaces.
User Input | Search Interprets |
---|---|
contact@wikidot.com | contactwikidot.com |
code-101 | code101 |
code,red | code red |
code:502 | code 502 |
Strangely, however, if only a single character comes before or after the non-alphanumeric character, that character is interpreted as a space instead of being ignored.
User Input | Search Interprets |
---|---|
f-100 | f 100 |
48-g | 48 g |
48@g | 48 g |
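The behaviour in the two tables above can be summarised as a small Python model. This is only a sketch of the reported observations, not Wikidot's actual parser: the function name `interpret` and both constants are hypothetical, and leaving the period untouched is an assumption taken from the first table row, where it survives.

```python
import re

SPACE_CHARS = ",:"  # reported to behave like spaces
KEPT_CHARS = "."    # assumption: the first table row shows the period surviving


def interpret(query):
    """Model how the search appears to rewrite a query (observed, not official)."""
    out = []
    for i, ch in enumerate(query):
        if ch.isalnum() or ch.isspace() or ch in KEPT_CHARS:
            out.append(ch)
            continue
        # Alphanumeric runs directly before and after the special character.
        before = re.search(r"[A-Za-z0-9]*$", query[:i]).group()
        after = re.match(r"[A-Za-z0-9]*", query[i + 1:]).group()
        if ch in SPACE_CHARS or len(before) == 1 or len(after) == 1:
            out.append(" ")  # treated as a space
        # Otherwise the character is dropped entirely.
    return "".join(out)


for example in ["code-101", "code,red", "f-100", "48@g", "AK-47"]:
    print(example, "->", interpret(example))
```

Under this model, F-15 becomes "F 15", which is consistent with the unrelated results reported in the comments.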
Why is this a problem?
This makes some keywords in a wiki unsearchable. For example, if you write a page about a weapon called the AK-47, you will never be able to find it by searching for "AK-47". This is because the search translates AK-47 into AK47, and the term "AK47" is not indexed in the wiki.
Possible solutions for developers
- Transform the indexing engine into one that indexes searchable terms. That is, if AK-47 appears in an article, it is indexed as AK47 so that it can be found when searching.
- Allow the search engine to accept non-alphanumeric characters in search terms.
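The first suggestion could look roughly like the following Python sketch, in which the same normalisation is applied to indexed words and to queries so that both sides agree. All names here (`normalize`, `build_index`, `search`) and the sample pages are hypothetical; nothing in this report describes how Wikidot's indexer is actually built.

```python
import re


def normalize(term):
    """Lower-case a term and strip every non-alphanumeric character."""
    return re.sub(r"[^0-9a-z]", "", term.lower())


def build_index(pages):
    """Map each normalised word to the set of page titles containing it."""
    index = {}
    for title, text in pages.items():
        for word in text.split():
            key = normalize(word)
            if key:
                index.setdefault(key, set()).add(title)
    return index


def search(index, query):
    """Look up a query after applying the same normalisation as the index."""
    return index.get(normalize(query), set())


# Hypothetical sample wiki content.
pages = {
    "Rifles": "The AK-47 is a rifle.",
    "Jets": "The F-15 Eagle is a fighter.",
}
index = build_index(pages)
print(search(index, "AK-47"))
print(search(index, "AK47"))
```

Because "AK-47" is stored as "ak47", it is found whether or not the hyphen is typed in the query.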
I'm not sure that is entirely accurate. If I do a search for F-15 (attempting to find this article) the search finds F-100 Super Sabre, Vickers F B 5, Mitsubishi F-1, and even "15, 1945)," but not F-15 Eagle. I'm not sure if that helps you fix the issue, but maybe it will clarify what the problem is.
Oh, thank you for pointing that out!
It appears that the glitch is worse than first thought: if you use a non-alphanumeric character, and a single character either precedes or follows it, the search interprets the non-alphanumeric character as a space.
I will update this page to account for that.
See also my two posts from Dec '08 in thread New Search Engine For Wikidot and the answers from development.
Also see my post about how difficult it is to find things even if you know exactly what you're looking for.
There is a small bug, but it works quite differently from what is described. If special characters were removed completely, searching for "com@munity" would be the same as searching for "community". That is not the case. The right solution is to replace more special characters with a space (I believe some of them are already replaced), so that searching for "AK-47" searches for "AK 47", and the document that contains "AK-47" gets the highest rank (because it is tokenized, so the "AK" and "47" tokens are both indexed).
Piotr Gabryjeluk
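The approach suggested in the comment above, replacing special characters with spaces and ranking by matching tokens, might be sketched in Python as follows. This is an illustration only: `tokenize`, `rank`, the sample pages, and the simple "count of matching tokens" score are all assumptions, not a description of Wikidot's real ranking.

```python
import re


def tokenize(text):
    """Replace runs of special characters with a space, then split into tokens."""
    return re.sub(r"[^0-9A-Za-z]+", " ", text).lower().split()


def rank(pages, query):
    """Order page titles by how many distinct query tokens each page contains."""
    wanted = set(tokenize(query))
    scores = {}
    for title, text in pages.items():
        hits = len(wanted & set(tokenize(text)))
        if hits:
            scores[title] = hits
    # Highest score first; ties are broken alphabetically by title.
    return sorted(scores, key=lambda title: (-scores[title], title))


# Hypothetical sample wiki content.
pages = {
    "AK-47": "The AK-47 is a rifle",
    "Route 47": "Highway 47 runs north",
    "Other": "Nothing relevant here",
}
print(rank(pages, "AK-47"))
```

The page containing both "AK" and "47" matches two tokens and ranks first, while a page that only mentions "47" still appears, but lower.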