
Michael Cizmar

Founder of an award winning consultancy


What you need to know about The Forrester Wave™: Cognitive Search And Knowledge Discovery Solutions, Q2 2017

michaelcizmar · Jun 13, 2017

It has been about 18 months since Forrester last published its Cognitive Search and Knowledge Discovery Solutions report. In general, Forrester reviewed the same companies. You can see a snapshot on a slideshare that we posted at MC+A here. Google is out, as is Lexmark; they have been replaced by Elastic and Squirro. Here are the big takeaways from the update:

Keyword search is dead

With the ability to spin up Elastic in a variety of on-premise and cloud versions, no one is looking to invest heavily in a commercial solution that simply provides keyword search. Pretty much everyone has begun to stop using the term “enterprise search”. With this I agree.
Cognitive search is about understanding the user’s intent and effecting outcomes. Last week I gave a webinar with Mark Floisand, Chief Marketing Officer of Coveo. We discussed this topic and the general shift towards insight engines. (If you go to the MC+A web site you can find the recording and slides.)

The report is big on relevancy

Relevancy is key, but I don’t think the report does a good job of defining how people are measuring it. Enterprise search has moved beyond providing a list of results. These are now insight engines that power assist features, bots and many inline contextual experiences.
Defining what is relevant is not simply a matter of a user providing a few keywords. It is done by looking at a variety of contextual data points, sometimes without any keywords at all.

The report establishes the general landscape

In general, companies that are considering a GSA re-platform or the implementation of a cognitive solution can shortlist just the companies that are on this list. These are generally well-run companies, each with an interesting approach to solving the problems faced by end users. Each has strengths and philosophies that should become clear when you look briefly at their products.

The report incorrectly focuses on vague lost time as a key driver

In 2017, it’s surprising to see a general reference to abstract employee time wasted cited as a key driver in purchasing. I haven’t sold a search solution to an organization because they thought their workers were being interrupted a few times per month to look up information. The key driver for this technology is injecting it into specific processes and workflows, which then benefit from the assisting function.

The scoring matrix seems odd

I like that Forrester provides its scoring matrix. It allows us to understand how each vendor did. I did notice some interesting categories and scoring. Some observations that I thought odd:

Market Presence weighted at Zero

The market presence seems to be weighted at Zero.  I am not sure if that’s a typo or not.  Market Presence is not as important as other categories but it should probably be greater than 0.
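
To make the effect of a zero weight concrete, here is a minimal sketch of a weighted total. The category scores are hypothetical (they are not taken from the report), and the 0.5 / 0.5 / 0.0 weights are an assumption based on the equal weighting of Current Offering and Strategy noted in this post.

```python
# Hypothetical category scores for one vendor (illustrative only, not from the report).
scores = {"current_offering": 3.8, "strategy": 4.2, "market_presence": 5.0}

# Assumed weights: Current Offering and Strategy weighted equally,
# Market Presence weighted at zero, as described in this post.
weights = {"current_offering": 0.5, "strategy": 0.5, "market_presence": 0.0}


def weighted_total(scores, weights):
    """Sum each category score multiplied by its weight."""
    return sum(scores[name] * weights[name] for name in weights)


total = weighted_total(scores, weights)

# The same vendor with no market presence at all gets an identical total,
# because a zero weight removes the category from the result entirely.
no_presence = dict(scores, market_presence=0.0)
same_total = weighted_total(no_presence, weights)

print(total, same_total)
```

With a zero weight, the category is reported but has no influence on the ranking, which is why it reads like a possible typo.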

Implementation Support is under Strategy

Not sure why implementation support is under strategy.  FWIW, MC+A implements many of these technologies.

Equal weighting of Current Offering to Strategy

No one should make a purchase based strictly on the application as it is today. This technology involves a significant investment and return, so I don’t think these two categories should be equally weighted. I also think there should be an additional category or entry for the effectiveness of the product’s AI out of the box.
Questions?  Drop a comment or send me a note at MC+A.  We’d be happy to discuss your use case.

Lucidworks Fusion index pipeline stage to transform Google Search Appliance feed XML

michaelcizmar · May 18, 2017

I’ve been encountering clients who have developed various processes to ingest content into their Google Search Appliance, which involves creating GSA feed XML. We’ve been replatforming some of these clients to Lucidworks Fusion. Fusion provides components that make ingesting this content easy, but they do require some configuration.

Step 1 – You need a connector

Fusion comes with connectors out of the box. Two datasources are likely candidates: the Local File System or Push Content (link). Push Content simply creates an endpoint that receives the content and places it into a defined indexing pipeline. For my demo, I used the Local File System.

Step 2 – Creating an indexing pipeline

There is some documentation on how to ingest XML documents on the Fusion documentation site (link). I chose this pipeline:

  1. Tika Transformation
    1. Return Parsed Content as XML or HTML [X]
    2. Return Original XML and HTML Instead of Tika XML Output [X]
  2. Custom Javascript Stage (See code below)
  3. Field Mapping (Default)
  4. Solr Dynamic Field Name Mapping (Default)
  5. Solr Indexer (Default)
The power here is the Nashorn engine that lies underneath. This allows you to use all of the underlying JDK classes, and the JavaScript will ultimately be ‘compiled’ into Java byte code. With this, I was able to use some simple classes to parse the XML and decode the base64 encoding that some feed items contained. The basic function was derived from a Lucidworks example.
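
The parse-and-decode logic can be sketched outside of Fusion. The example below is not the JavaScript stage from this post; it is a minimal Python illustration of the same idea, assuming a feed that uses `<record>` elements with `url` and `mimetype` attributes, optional `<meta>` entries, and `<content encoding="base64binary">`. The sample feed, the function name and the output field names (`body`, `metadata`) are all made up for the illustration.

```python
import base64
import xml.etree.ElementTree as ET

# Small made-up feed in the GSA feed XML shape (illustrative data only).
SAMPLE_FEED = """<gsafeed>
  <header>
    <datasource>sample</datasource>
    <feedtype>full</feedtype>
  </header>
  <group>
    <record url="http://www.example.com/hello" mimetype="text/plain">
      <content encoding="base64binary">SGVsbG8gV29ybGQ=</content>
    </record>
    <record url="http://www.example.com/plain" mimetype="text/plain">
      <metadata>
        <meta name="author" content="Jane"/>
      </metadata>
      <content>Plain text body</content>
    </record>
  </group>
</gsafeed>"""


def parse_gsa_feed(feed_xml):
    """Return one dict per <record>, decoding base64 content where flagged."""
    root = ET.fromstring(feed_xml)
    documents = []
    for record in root.iter("record"):
        doc = {
            "url": record.get("url"),
            "mimetype": record.get("mimetype"),
            "metadata": {m.get("name"): m.get("content") for m in record.iter("meta")},
            "body": "",
        }
        content = record.find("content")
        if content is not None and content.text:
            if content.get("encoding") == "base64binary":
                # Decode the payload; undecodable bytes are replaced rather than raising.
                raw = base64.b64decode(content.text)
                doc["body"] = raw.decode("utf-8", errors="replace")
            else:
                doc["body"] = content.text
        documents.append(doc)
    return documents


docs = parse_gsa_feed(SAMPLE_FEED)
for doc in docs:
    print(doc["url"], "->", doc["body"])
```

In a Fusion JavaScript stage the same steps would be done with JDK classes through Nashorn, and each parsed record would be mapped onto pipeline document fields rather than a plain dictionary.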

Failures to Launch in Cognitive Search

michaelcizmar · Apr 28, 2017

Natural Language Processing is one of the cornerstones of Cognitive Search. It is what translates your stream of text into some understanding of intent and action. I’m assembling a collection of notable misses to remind us that we still have a way to go towards richer Human Computer Interaction.

Question: Siri – “Is the moon made out of cheese?”

Answer: Some information about a Japanese adult visual novel.

While Siri was able to translate my speech to text, “she” failed to interpret the intent of my text.

Question: Google – “How much does a pound of water weigh?”

Answer: Apparently 8.34 pounds?

Question: What was the Cubs Score Sunday?

The key part of the phrase is “was”. Bing gets it right; Google tells me the upcoming schedule.

Top 5 insights from Gartner’s Magic Quadrant for Insight Engines

michaelcizmar · Apr 18, 2017

Gartner has migrated the “Gartner Magic Quadrant for Enterprise Search” to the “Magic Quadrant for Insight Engines”. This migration follows a trend set by Google, which effectively announced its exit from the enterprise search space, only to later reveal an insight engine embedded within G Suite called Cloud Search. Gartner describes an Insight Engine as:

“Insight engines provide more-natural access to information for knowledge workers and other constituents in ways that enterprise search has not.”

This is somewhat false, because insight engines are typically built on a technology stack similar to that of enterprise search. Regardless of the name change, some fundamental changes in the market exist. In my view, the key insights are:

1 – We are moving beyond keyword Search

At MC+A we made some predictions for 2017 for the cognitive area in an insight article published earlier this year. I have heard a few times in the past couple of months that the go-forward strategy is that the CMS has keyword search and therefore there isn’t a need for an insight engine.
A keyword-based search index makes retrieving a known document fairly easy and efficient. If you do not know the right combination of keywords, or you issue the wrong keyword, then you are not going to be successful. In contrast, an insight engine takes many more scores into account beyond keyword matches.
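
As a rough illustration of the difference, the sketch below ranks two documents first by keyword overlap alone and then by a blend of keyword overlap and contextual signals. Everything here is hypothetical: the signals (`department`, `team_clicks`), the boost values and the sample documents are invented for the example and do not describe any particular product.

```python
def keyword_score(query, text):
    """Fraction of query terms that appear in the document text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def blended_score(query, doc, user):
    """Keyword score plus hypothetical contextual boosts."""
    score = keyword_score(query, doc["text"])
    if doc["department"] == user["department"]:
        score += 0.5                      # same-department boost (assumed value)
    score += 0.1 * doc["team_clicks"]     # usage signal from colleagues (assumed value)
    return score


docs = [
    {"id": "expense-policy", "text": "travel expense policy",
     "department": "finance", "team_clicks": 0},
    {"id": "booking-guide", "text": "travel booking guide",
     "department": "sales", "team_clicks": 3},
]
user = {"department": "sales"}
query = "travel policy"

by_keyword = sorted(docs, key=lambda d: keyword_score(query, d["text"]), reverse=True)
by_blend = sorted(docs, key=lambda d: blended_score(query, d, user), reverse=True)

print([d["id"] for d in by_keyword])
print([d["id"] for d in by_blend])
```

The keyword-only ranking favors the document that matches both terms, while the blended ranking favors the document the user's own team actually uses.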

2 – Your CMS having a Lucene based index is not going to cut it

As noted in numerous articles, CMSs are designed for the people who buy them and not for the people who use them. The search feature is not something that is typically tested properly, because most searches are in the long tail and are hard to fully understand. A simple inverted index of keywords matched with some stems is not the experience that consumers of your system want.
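
For readers who have not looked under the hood, this is roughly what such an index amounts to. The sketch is a toy: the suffix-stripping `crude_stem` function is a stand-in invented for the example and is far simpler than the analyzers a Lucene-based index actually uses.

```python
from collections import defaultdict


def crude_stem(word):
    """Strip a few common suffixes (a toy stand-in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[crude_stem(word)].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every stemmed query term."""
    postings = [index.get(crude_stem(w), set()) for w in query.lower().split()]
    return set.intersection(*postings) if postings else set()


docs = {1: "resetting passwords", 2: "password policy", 3: "holiday schedule"}
index = build_index(docs)

exact_hits = search(index, "passwords")
long_tail_hits = search(index, "forgot my login")

print(exact_hits, long_tail_hits)
```

The first query finds both password documents through the shared stem. The second, a perfectly reasonable way to describe the same need, returns nothing because none of its words appear in the index.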

3 – Natural Language Processing is the gateway to real time interactions

Search and assist can take cues from navigation paths, signals processing and other ML techniques, but none of these will be as useful as someone telling you exactly what they are looking for. That is what natural language input provides.

4 – We’re going to need a bigger boat

Machine Learning and AI algorithms need lots of computing cycles. Disks are slow, even if they are SSDs, and memory is fast. Analytics and signals are generated at an exponential rate relative to search queries. Additionally, indexing your content with NLP can be quite resource-consuming. (Cloud services and SaaS offerings hide this from consumers.)

5 – When it’s all said and done, there will still be something that looks like a searchbox

Count the number of people outside of an Apple commercial who use Siri, and then think how awful the office will become if people are endlessly repeating their voice requests. As far as Human Computer Interfaces go, the search box is here to stay for the near term. Search Engine Land predicted its death last year.
If you read the article, you’ll note that most of the searching is still done through a box, albeit with an advanced assist feature that allows for autocomplete.

Google rolls Springboard to Cloud Search

michaelcizmar · Feb 9, 2017


In June, Google announced a more powerful connected platform for search within Google Apps. This was the long-awaited announcement of its post-Google Search Appliance product. An early adopter program and an announcement at BoxWorks were the only public acknowledgements since then. This week, Google has finally made the product generally available.

Described as using “Machine Learning”, this product sets the stage for a future direction of search within Google Apps. Users of Google Now are going to be unimpressed with the current knowledge and assist cards. The big feature is that you do get search across your Google Apps content (Mail, Drive, Sites and Calendar), something that heavy users of G Suite have been wanting for quite some time.


I am still waiting for an iOS version of the app but here are some of its key features:


Assist Cards



Similar to Google Now cards, Cloud Search has a few knowledge cards to assist you throughout the day. Cloud Search will show you upcoming meetings, recently viewed documents and suggested documents based on other actions. Additional cards will be added over time.

Company Directory

Searches can bring up directory information.


Security

Cloud Search honors the sharing permissions for files and other resources. This means you only see results to which you have access.
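
The general idea of permission-aware results, often called security trimming, can be sketched in a few lines. This is only an illustration of the concept and not how Cloud Search implements it; the `shared_with` field, the group names and the sample results are assumptions made up for the example.

```python
def trim_by_permission(results, user, user_groups):
    """Keep only results shared with the user or one of the user's groups."""
    principals = {user} | set(user_groups)
    return [r for r in results if principals & set(r["shared_with"])]


# Hypothetical result set with per-item sharing lists.
results = [
    {"title": "Q3 board deck", "shared_with": ["executives"]},
    {"title": "Team offsite agenda", "shared_with": ["sales-team"]},
    {"title": "My draft notes", "shared_with": ["ana@example.com"]},
]

visible = trim_by_permission(results, "ana@example.com", ["sales-team"])
print([r["title"] for r in visible])
```

The board deck matches the query but is never shown, because the user is neither named on it nor a member of a group it is shared with.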

Roll Out

Cloud Search will roll out over the next few weeks to G Suite Business and Enterprise customers. The Early Adopters will have access through March.

