
"Stop Words" in full text search

Stop words are words that carry little meaningful information. You can think of them as filler words. If they are used in a search statement, the result could potentially contain every record in the data store.



It is always a good idea to filter out these words before constructing the search query. This makes the query smaller and faster, and the results more relevant.
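As a rough illustration of that filtering step, here is a minimal sketch in JavaScript. The helper name removeStopWords and the tiny STOP_WORDS set are assumptions made for this example only; a real list would be much longer.

```javascript
// Tiny illustrative stop-word list (an assumption; a real list has
// several hundred entries).
const STOP_WORDS = new Set(["a", "an", "and", "are", "in", "is", "of", "the", "to", "what"]);

// Hypothetical helper: split the raw search text into lowercase words and
// drop every word that appears in the stop-word set.
function removeStopWords(text, stopWords = STOP_WORDS) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9']+/) // split on anything that is not a letter, digit or apostrophe
    .filter((word) => word.length > 0 && !stopWords.has(word));
}

const terms = removeStopWords("What is the meaning of stop words in full text search?");
console.log(terms.join(" ")); // meaning stop words full text search
```

The remaining terms are what you would pass on to the full text engine, so the engine never has to match the filler words.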







“One of our major performance optimizations for the “related questions” query is removing the top 10,000 most common English dictionary words (as determined by Google search) before submitting the query to the SQL Server 2008 full text engine. It’s shocking how little is left of most posts once you remove the top 10k English dictionary words. This helps limit and narrow the returned results, which makes the query dramatically faster.” - stackoverflow.com


English stop word list:
http://jmlr.csail.mit.edu/papers/volume5/lewis04a/a11-smart-stop-list/english.stop


Some interesting articles
http://www.codinghorror.com/blog/2008/11/stop-me-if-you-think-youve-seen-this-word-before.html
http://www.seobythesea.com/2008/08/google-stopword-patent/
http://www.googleguide.com/interpreting_queries.html





Popular posts from this blog

ERROR: Ignored call to 'alert()'. The document is sandboxed, and the 'allow-modals' keyword is not set.

Recently I ran into this issue while writing a code snippet in "JSFiddle". After searching, I found that it was caused by a new feature added in "Chrome 46+". But at the same time, Chrome doesn't have support for the "allow-modals" keyword in the "sandbox" attribute.

Chromium issue for the above behavior:
https://codereview.chromium.org/1126253007

To make it work, add "allow-scripts allow-modals" to the "sandbox" attribute and use "window.alert" instead of "alert".



<!-- Sandbox frame will execute javascript and show modal dialogs -->
<iframe sandbox="allow-scripts allow-modals" src="iframe.html">
</iframe>
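If you want to check a sandbox value before using it, a small helper like the one below can do it. This is only a sketch: sandboxAllowsModals is a hypothetical name, and it simply looks for the two keywords mentioned above in the attribute string.

```javascript
// Hypothetical helper: report whether a sandbox attribute value contains
// both keywords needed for scripts in the frame to show modal dialogs.
function sandboxAllowsModals(sandboxValue) {
  // Sandbox keywords are space-separated and case-insensitive.
  const tokens = new Set(sandboxValue.trim().toLowerCase().split(/\s+/));
  return tokens.has("allow-scripts") && tokens.has("allow-modals");
}

console.log(sandboxAllowsModals("allow-scripts allow-modals")); // true
console.log(sandboxAllowsModals("allow-scripts"));              // false
```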


Feature added: Block modal dialog inside a sandboxed iframe.
Link: https://www.chromestatus.com/feature/4747009953103872

Working demo page for the feature:
https://googlechrome.github.io/samples/block-modal-dialogs-sandboxed-iframe/index.html



JavaScript [ExtJs3]: EditorGridPanel Read-Only (dynamically)

Many times we face a scenario where we have to make the editor grid read-only dynamically.


Ext.override(Ext.ux.grid.CheckColumn, {
    editable: true,
    onMouseDown: function (e, t) {
        if (Ext.fly(t).hasClass(this.createId())) {
            e.stopEvent();
            var me = this,
                grid = me.grid,
                view = grid.getView(),
                index = view.findRowIndex(t),
                colindex = view.findCellIndex(t),
                record = grid.store.getAt(index);
            // toggle the value only when the grid is not read-only and the cell is editable
            if (!grid.isReadOnly && grid.colModel.isCellEditable(colindex, index)) {
                record.set(me.dataIndex, !record.data[me.dataIndex]);
            }
        }
    }
});

var grid = new Ext.grid.EditorGridPanel({
    ...
    isReadOnly: true, // set this flag to make the check column read-only
    ...
});

// to make the other columns read-only
grid.on('beforeedit', function () {
    return false;
});
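The 'beforeedit' handler above always blocks editing. To switch the other columns at runtime as well, the handler could read the same isReadOnly flag. The following is a minimal sketch; createReadOnlyGuard is a hypothetical helper and fakeGrid is a stand-in object so the example runs without ExtJS.

```javascript
// Hypothetical helper: build a 'beforeedit' handler that consults the grid's
// custom isReadOnly flag on every edit attempt.
function createReadOnlyGuard(grid) {
  return function () {
    // Returning false from a 'beforeedit' listener cancels the edit.
    return !grid.isReadOnly;
  };
}

// Stand-in object so the sketch runs without ExtJS loaded.
const fakeGrid = { isReadOnly: true };
const guard = createReadOnlyGuard(fakeGrid);
console.log(guard()); // false: editing is blocked

fakeGrid.isReadOnly = false;
console.log(guard()); // true: editing is allowed again
```

With a real grid this would be wired up as grid.on('beforeedit', createReadOnlyGuard(grid)), and flipping grid.isReadOnly then toggles both the check column and the regular editors.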

JavaScript [ExtJs3]: Total “Record” count in filtered store

There are two ways to get the record count from the Store:
store.getTotalCount(): This function depends on a value in the server response. For the value to be accurate, the total property must be returned by the server.

Property name for the different readers:
totalProperty for JsonReader, totalRecords for XmlReader
store.getCount(): Returns the number of records in the store.
If a filter is applied to the store, it gives you the number of filtered records.
If you want the total number of records regardless of filtering, it looks like this:

var totalRecords = store.snapshot ? store.snapshot.length : store.getCount();
“snapshot” is the property of “Store” that holds the actual (unfiltered) data when a filter has been applied.
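To see how that expression behaves in both cases, here is a small sketch that wraps it in a function and runs it against stand-in objects. getUnfilteredCount and the two fake stores are assumptions for illustration; they only mimic the snapshot and getCount() members used above.

```javascript
// Hypothetical helper wrapping the expression above.
function getUnfilteredCount(store) {
  return store.snapshot ? store.snapshot.length : store.getCount();
}

// Stand-ins for an Ext store: one without a filter, one with a filter applied.
const unfilteredStore = { snapshot: null, getCount: () => 5 };
const filteredStore = { snapshot: { length: 5 }, getCount: () => 2 };

console.log(getUnfilteredCount(unfilteredStore)); // 5
console.log(getUnfilteredCount(filteredStore));   // 5, although getCount() reports 2
```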