EmailDiscussions.com  

Old 12 Apr 2016, 04:54 AM   #1
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Search problem with "Wedding" (bad stemming?)

I noticed an interesting problem when searching for messages containing the word "wedding" — using either the web interface or an IMAP search. It seems that I got a LOT of hits, and on further investigation, it looks like every message sent to me on a Wednesday gets included.

I suspect it's an unintended side effect of stemming, and an inconvenient one in this case. In much the same way that a search for "bus" matches "buses" (and vice versa), a search for "wedding" is going to match "wed", since that's another form of the word. The problem is that "Wed" is also the abbreviation for "Wednesday".

The only way I could find to work around this was to use the "substr" directive to search for the exact word, but of course this only works in the web interface, and not when searching from an IMAP client such as Apple Mail.

Old 12 Apr 2016, 05:11 AM   #2
janusz
The "e" in e-mail
 
Join Date: Feb 2006
Location: EU
Posts: 4,945
Try: wedding NOT Wed
Works for me in the classic interface.
Old 13 Apr 2016, 12:46 AM   #3
ChinaLamb
The "e" in e-mail
 
Join Date: Dec 2004
Location: a virtually impossible but finitely improbable position
Posts: 2,320
Quote:
Originally Posted by jhollington
I noticed an interesting problem when searching for messages containing the word "wedding" — using either the web interface or an IMAP search. It seems that I got a LOT of hits, and on further investigation, it looks like every message sent to me on a Wednesday gets included.

I suspect it's an unintended side effect of stemming, and an inconvenient one in this case. In much the same way that a search for "bus" matches "buses" (and vice versa), a search for "wedding" is going to match "wed", since that's another form of the word. The problem is that "Wed" is also the abbreviation for "Wednesday".

The only way I could find to work around this was to use the "substr" directive to search for the exact word, but of course this only works in the web interface, and not when searching from an IMAP client such as Apple Mail.
Haha... yeah. Wedding turns up results for all "Wed"

/cl
Old 13 Apr 2016, 08:19 AM   #4
David
Ultimate Contributor
 
Join Date: Dec 2001
Location: Canada.
Posts: 10,355
Would it not also turn up results for 'We'...?
Old 13 Apr 2016, 09:06 AM   #5
robn
Master of the @
 
Join Date: May 2012
Location: Melbourne, Australia
Posts: 1,007

Representative of:
Fastmail.fm
We're currently using Xapian's stock English stemmer, also known as "porter2", which is considered the "standard" English stemmer for general use.

(Here's an intro to stemming for those interested).

A standard IMAP SEARCH command (which most clients use) will not use the search index, but instead just do regular substring searches. These are slow, but sometimes more precise. You can get this behaviour in the web interface using substr: or even imap: (more info in the search docs).
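To make the difference concrete, here is a minimal Python sketch of the two search styles. It is only an illustration, not FastMail's or Xapian's code: `toy_stem` is a hypothetical stand-in that handles nothing but a trailing "-ing" (the real porter2 stemmer has many more rules), the three messages are made up, and it assumes the Date header text ends up in the index.

```python
import re

def toy_stem(word):
    """Toy stand-in for a real stemmer: only strips a trailing "-ing"."""
    w = word.lower()
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]
        # Undouble a trailing doubled consonant: "wedd" -> "wed".
        if len(w) >= 2 and w[-1] == w[-2] and w[-1] not in "aeiou":
            w = w[:-1]
    return w

# Made-up messages; header text is assumed to be searchable.
MESSAGES = {
    1: "Date: Wed, 13 Apr 2016\nSubject: Lunch?\n\nAre you free today?",
    2: "Date: Mon, 11 Apr 2016\nSubject: Wedding invitation\n\nPlease RSVP.",
    3: "Date: Tue, 12 Apr 2016\nSubject: Invoice\n\nPayment is due.",
}

def substring_search(messages, needle):
    """Non-indexed style: scan every message for the literal text."""
    needle = needle.lower()
    return sorted(mid for mid, text in messages.items() if needle in text.lower())

def build_index(messages):
    """Indexed style: map each stemmed word to the messages containing it."""
    index = {}
    for mid, text in messages.items():
        for token in re.findall(r"[A-Za-z]+", text):
            index.setdefault(toy_stem(token), set()).add(mid)
    return index

def indexed_search(index, query):
    """Look up the stemmed query term in the index."""
    return sorted(index.get(toy_stem(query), set()))

print(substring_search(MESSAGES, "wedding"))             # [2]
print(indexed_search(build_index(MESSAGES), "wedding"))  # [1, 2]
```

In this sketch both "Wedding" and "Wed" reduce to the same index term, so the indexed lookup also returns the message that was merely sent on a Wednesday, while the substring scan returns only the message that literally contains "wedding".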

Clients can use the IMAP SEARCH=FUZZY extension if they want to use the search index, which will give largely the same results as the web client (but not across folders; that's a FastMail extension). So if you're getting the same results in a client, that's probably what's going on (I don't know what Apple Mail does myself. I can test if you're interested).
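For anyone who wants to try this from a script, the following Python sketch builds the search criteria for both styles. It assumes the RFC 6203 syntax, where the `FUZZY` keyword is placed in front of an ordinary search key, and that the server advertises SEARCH=FUZZY in its capabilities. The helper names are hypothetical, and the host and credentials in the usage comment are placeholders.

```python
def quote_imap_string(value: str) -> str:
    """Wrap a value as an IMAP quoted string, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def text_search_criteria(term: str, fuzzy: bool = False) -> list[str]:
    """Build criteria for a TEXT search, optionally prefixed with FUZZY."""
    key = ["TEXT", quote_imap_string(term)]
    return ["FUZZY", *key] if fuzzy else key

print(text_search_criteria("wedding"))              # ['TEXT', '"wedding"']
print(text_search_criteria("wedding", fuzzy=True))  # ['FUZZY', 'TEXT', '"wedding"']

# Possible usage with the standard library's imaplib (placeholder values):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("imap.example.com")
#   conn.login("user@example.com", "app-password")
#   conn.select("INBOX", readonly=True)
#   status, data = conn.search(None, *text_search_criteria("wedding", fuzzy=True))
```

Running the same term with and without `fuzzy=True` against a test folder is one way to see whether a result set came from the index or from a plain substring scan.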

Since iOS 6 the iOS Mail app issues regular (non-indexed) searches, but does them across all folders at once, so ends up really hurting the server. To get around that, when we detect a search from iOS Mail, we automatically enable SEARCH=FUZZY to ensure good performance.

Hopefully that explains it all.
Old 16 Apr 2016, 12:31 AM   #6
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by janusz
Try: wedding NOT Wed
Works for me in the classic interface.
Ah, of course. Great tip, thanks. It seems to do something different with an iOS-based IMAP search, but I haven't quite figured out what yet. It looks like it might actually break something, in fact, as I get one result and then the search just sits there, not quite complete. Either way, going back to the Classic interface of the FastMail iOS app isn't a big problem for times when I need to do more complicated searches.

Quote:
Originally Posted by David
Would it not also turn up results for 'We'...?
I'd guess not, since "We" isn't a short or alternative form of "Wedding". Stemming, as I understand it, is for word constructs, not merely shorter versions. "Wedding", although commonly used these days as a noun, would technically be the action form of the verb "to wed".

It's just unfortunate that it's also the short form for Wednesday. There's probably a message there somewhere, but I'm not sure exactly what.
Old 16 Apr 2016, 12:42 AM   #7
jhollington
Essential Contributor
 
Join Date: Apr 2008
Posts: 371
Quote:
Originally Posted by robn
Clients can use the IMAP SEARCH=FUZZY extension if they want to use the search index, which will give largely the same results as the web client (but not across folders; that's a FastMail extension). So if you're getting the same results in a client, that's probably what's going on (I don't know what Apple Mail does myself. I can test if you're interested).
Thanks for the explanation, that's actually quite interesting, and refreshed my memory about some of the search stuff I read on the blog a while back.

Apple Mail (the OS X app) isn't really impacted by this for me, as I don't do IMAP searches from there. I've got all of my mail synced to my Mac, so it's using its own Spotlight system at that point, with its own special variety of twisted logic.

Quote:
Since iOS 6 the iOS Mail app issues regular (non-indexed) searches, but does them across all folders at once, so ends up really hurting the server. To get around that, when we detect a search from iOS Mail, we automatically enable SEARCH=FUZZY to ensure good performance.
That would be the part I remember reading, and it was the point where search in iOS Mail actually started working with FastMail. It works seriously well in fact, as it was abysmally slow before, almost to the point of not even bothering to try.

That said, I did run into something interesting. An Apple Mail IMAP search that uses "NOT", as janusz described above, seems to work properly but also takes significantly longer to run, on the order of several minutes (to give you an idea, the search was still happening when I finished my last reply, and only spat out the results when I was about halfway through this one). I got one result the first time, and the search bar in iOS Mail skimmed across to about 90% and then sat there for another 3-5 minutes before finishing and displaying the rest of the results. The second time I ran the same search, I got more results (all of them, as far as I can tell), but the search bar did the same thing.
All times are GMT +9.