Based on Google’s announcement, Springboard should be entering general availability (although most likely still in “beta”). Not having been asked what should be in the product, I’ve put together my own short wish list. Obviously this is a much larger topic than can be covered in a short blog post. Nonetheless, here are a couple of thoughts:
1 – What should we use for our connector framework?
You should migrate from the existing Plexi framework to a microservices-based architecture built on Spring or on pure Google Cloud Platform. Spring runs everywhere, both on prem and in the cloud. Microservices allow for a more pluggable, dependency-injection model for traversing and processing data on its way to the cached store, and the native Cloud Platform is excellent at processing large amounts of data.
One reason not to use the current Plexi framework is that it does not use dependency injection and therefore requires complicated re-architecture and recoding for changes. Further, crawling is slow and requires many HTTP connections. Indexing in batches and streams is much more efficient. The connectors should sync data to a temporal storage system and use something like Cloud Dataflow for stream and event-based processing.
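To make the dependency injection point concrete, here is a minimal sketch in plain Python. Every name in it (`Document`, `Source`, `Sink`, `Connector`, and the in-memory classes) is hypothetical and is not part of Plexi, Spring, or any Google API. It only illustrates the shape: the connector receives its source and sink from the caller and writes in batches instead of one request per document.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Protocol


@dataclass(frozen=True)
class Document:
    doc_id: str
    body: str


class Source(Protocol):
    """Hypothetical repository adapter: yields the documents to sync."""

    def fetch(self) -> Iterable[Document]: ...


class Sink(Protocol):
    """Hypothetical staging store: accepts documents one batch at a time."""

    def write_batch(self, batch: List[Document]) -> None: ...


def batched(docs: Iterable[Document], size: int) -> Iterator[List[Document]]:
    """Group documents into lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Document] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly shorter, batch
        yield batch


class Connector:
    """Gets its source and sink from the caller (constructor injection)."""

    def __init__(self, source: Source, sink: Sink, batch_size: int = 100) -> None:
        self.source = source
        self.sink = sink
        self.batch_size = batch_size

    def sync(self) -> int:
        """Copy everything from source to sink; return the number of batch writes."""
        writes = 0
        for batch in batched(self.source.fetch(), self.batch_size):
            self.sink.write_batch(batch)
            writes += 1
        return writes


class InMemorySource:
    """Stand-in source for tests; a real one would read from a repository."""

    def __init__(self, docs: List[Document]) -> None:
        self.docs = docs

    def fetch(self) -> Iterable[Document]:
        return iter(self.docs)


class InMemorySink:
    """Stand-in sink for tests; a real one would write to the staging store."""

    def __init__(self) -> None:
        self.batches: List[List[Document]] = []

    def write_batch(self, batch: List[Document]) -> None:
        self.batches.append(batch)
```

Because `Connector` depends only on the two small interfaces, swapping the in-memory sink for one that writes to a storage bucket would not require touching the traversal code, which is the kind of change that is painful without injection.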
2 – Where should that temporal storage system be then?
Google Cloud Storage should be the primary data source for external content ingestion. The API is documented and performs very well under large data loads. Connectors sync data to a variety of buckets. As the system is updated or potentially requires reindexing, change event notifications can be wired up to update the index automatically (link). The storage is encrypted at rest and provides low latency access from other GCP regions.
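As a rough sketch of how those change notifications could drive index updates: Cloud Storage notifications delivered through Pub/Sub carry message attributes such as `eventType`, `bucketId`, and `objectId`. The function below maps those attributes to an index action. The function name, the action labels, and the decision to treat an archived object as removed are my own assumptions, not anything Springboard defines.

```python
from typing import Mapping, Optional, Tuple

# Values of the `eventType` attribute on Cloud Storage Pub/Sub notifications.
REINDEX_EVENTS = {"OBJECT_FINALIZE", "OBJECT_METADATA_UPDATE"}
REMOVE_EVENTS = {"OBJECT_DELETE", "OBJECT_ARCHIVE"}


def plan_index_action(attributes: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return a hypothetical (action, uri) pair, or None if nothing should happen."""
    bucket = attributes.get("bucketId")
    name = attributes.get("objectId")
    event = attributes.get("eventType")
    if not bucket or not name:
        return None  # malformed notification: ignore rather than guess

    uri = f"gs://{bucket}/{name}"
    if event in REINDEX_EVENTS:
        return ("reindex", uri)
    if event in REMOVE_EVENTS:
        # An overwrite also emits a delete/archive event for the old generation,
        # marked with `overwrittenByGeneration`. The new generation triggers its
        # own OBJECT_FINALIZE, so the entry must not be removed from the index.
        if "overwrittenByGeneration" in attributes:
            return None
        return ("remove", uri)
    return None  # unknown event type
```

For example, a notification with `eventType` set to `OBJECT_FINALIZE` for object `a.pdf` in bucket `docs` yields `("reindex", "gs://docs/a.pdf")`.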
3 – How should we process the data?
Data stored in Google Cloud Storage can be synced into the Springboard index via Cloud Pub/Sub and Cloud Dataflow. Standard processing templates can be provided, and users should be allowed to upload their own pipelines that leverage other machine learning APIs, both in GCP and elsewhere.
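The idea of standard templates plus user-supplied pipeline steps can be sketched without any cloud dependency. The code below is plain Python standing in for a real Dataflow pipeline, not the Dataflow or Apache Beam API. All names are hypothetical, and `tag_language` is only a placeholder for where a call to a machine learning API might go.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, str]
# A stage returns a (possibly modified) record, or None to drop it.
Stage = Callable[[Record], Optional[Record]]


def run_pipeline(records: Iterable[Record], stages: List[Stage]) -> List[Record]:
    """Apply each stage in order to every record, skipping dropped records."""
    output: List[Record] = []
    for record in records:
        current: Optional[Record] = dict(record)  # copy so inputs stay untouched
        for stage in stages:
            current = stage(current)
            if current is None:
                break  # a stage dropped this record
        if current is not None:
            output.append(current)
    return output


def drop_empty(record: Record) -> Optional[Record]:
    """Standard stage: discard records with no body text."""
    return record if record.get("body", "").strip() else None


def normalize_whitespace(record: Record) -> Optional[Record]:
    """Standard stage: collapse runs of whitespace in the body."""
    return {**record, "body": " ".join(record["body"].split())}


def tag_language(record: Record) -> Optional[Record]:
    """User-supplied stage (placeholder): a real one might call an ML API here."""
    return {**record, "lang": "und"}  # "und" = undetermined
```

A user pipeline is then just a list of stages, for example `run_pipeline(records, [drop_empty, normalize_whitespace, tag_language])`, where the first two would come from a provided template and the last is uploaded by the user.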
4 – How should we display the result set?
I have a core belief that search should not be an “opt-in” experience. What I mean by this is that you should not have to go to “springboard.google.com”; rather, you should be able to bring search into the applications where you want to expose it contextually. Search is simply a service that can be extended through whatever framework you like.
5 – What other things should we think about?
Don’t make search opt in
Integration with Other Google APIs
Integration with Google Cloud
Currently Springboard seems to be integrated deeply within G Suite. This potentially limits its use cases in the same way that Hangouts and Google Docs are limited. Everything is tied to a G Suite user account and the underlying G Suite infrastructure. Google Cloud, on the other hand, has already exposed services and infrastructure that widen the use cases for which Springboard can be leveraged.