WHO’S THE NEXT GOOGLE? Shift to Knowledge Age and Exploratory Search

Proceedings of the 9th CIDI and 9th CONGIC Luciane Maria Fadel, Carla Spinillo, Anderson Horta, Cristina Portugal (orgs.) Sociedade Brasileira de Design da Informação – SBDI Belo Horizonte | Brazil | 2019 ISBN 978-85-212-1728-2


Introduction
With the development of the Internet, search engines such as Google have become indispensable in everyday life. For most people who use the Internet as part of their daily life (McMillan & Morrison, 2006), these tools have long been a vital means of discovering information and keeping up to date with socio-economic trends (Madon, 2000).
Search engines have indeed brought enormous convenience; however, their limitations have also started to show in some cases (Butler, 2000). Because current search tools are based on keyword searching, they appear clumsy and inconvenient for users who search in order to learn about a domain. Imagine doing a study related to "computer vision" without being familiar with the term or the domain. Seeking relevant information through a Google-like search engine would be a painful process: you would need to filter through a list of search results with no logical order to build up systematic knowledge of the topic. After choosing a direction of interest, you would then need to refine keywords repeatedly to obtain optimal search results. The search engine could not even help you access the broad knowledge field if you did not know the key technical term "computer vision", which is perhaps why search queries are called "keywords". This limitation reflects the growth of users' information needs. As the classic saying "Thanking Wikipedia and Google for our degrees" suggests, users' expectations of what they can obtain from the Internet have grown from finding the answer to a single query to acquiring broad knowledge of a certain field. A report by the International Education Advisory Board in 2008 noted that Millennials, the current generation of students born between 1980 and 2000, tend to be more open to Internet learning; they use Internet searches to find information and learn new subjects by clicking through hyperlinks from their original searches (Bickham, Bradburn, Edwards, Fallon, Luke, Mossman & Ness, 2008). As suggested by information foraging theory, search tools should evolve to adapt to these growing user demands so as to offer the most profitable way of retrieving information.
When we rethink our searching behaviour, it turns out to be a painful compromise between human ability and technological limitation. As James Whittaker, Distinguished Engineer and Technical Evangelist at Microsoft, pointed out, "search" is a negative word which "fairly reeks of loss and effort". When individuals conduct a search, they have lost something at the moment they need it. The process of searching is also burdensome, as people have to identify channels to access and process information that may be relevant before finding what they need. By analogy, imagine searching for door keys. The search only starts when you need the key, and it is not pleasant to go through every possible place to find it, let alone to search for information over the Internet, where the scope and volume are hugely expanded.
To acquire information effectively, people have developed a strategy of staying organized and forcing memory recall, which is, of course, not sustainable given limited human capacity. With higher expectations of technology, an ideal information retrieval scenario would be one in which information comes to users directly when they need it, rather than requiring them to perform certain actions to find it. However, this requires human data as input for correct prediction. The arrival of Big Data and IoT has made it feasible to predict users' specific information needs, e.g. accessing calendar information to automatically push a travel route plan. However, the earlier case of searching for learning is still challenging to resolve through prediction and auto-pushing, as it requires input about users' current knowledge systems and their real-time learning process.
If not through auto-pushing, then how should search tools better adapt to the need for learning in the near future? Compared with the experience of acquiring knowledge from search engines, a typical successful consultation with a human expert is similar yet different. The expert explores individuals' queries based on his or her knowledge systems and delivers answers with context. After some confirmation or discussion with the individual, the consultant can retrieve closely related information the person needs and may even inspire or remind the individual of related areas that would otherwise have remained ignored. This paper first examines the growth of users' information needs across the Tabulating Era, the Programming Era and the current Cognitive Era. This growth runs from Own-Item Search to locate data, through Known-Item Search to find information, to Exploratory Search to learn knowledge today, and potentially Exhaustive Search to master wisdom in the future. The paper then draws on information foraging theory to understand how a successful search tool should adapt to users' growing information needs, which explains the success of Google's PageRank algorithm in the Programming Era and its potential decline in the coming Cognitive Era. Building on this understanding of technological development and users' growing information needs, the paper proposes four main points the next search tool should address, derived from examining the Information Search Process (ISP) developed by Carol Kuhlthau in 1991, and discusses some potential solutions.

Therefore, technology capacity improved dramatically upon entering the cognitive era with artificial intelligence, which should drive changes in search engines to better fulfil users' information needs. The changes will likely address the following features:

▪ The pain of known-item searches will be resolved, such that information will be pushed to users automatically when they need it; and
▪ Exploratory searches will be facilitated by search tools to help users acquire knowledge more effectively through the search process.

The trend of Information Need
This section discusses four different information needs and their corresponding search modes. Through historical analysis and trend forecasting, it reveals the growth of information needs over time, including a prediction of how technology might evolve in the future to meet users' growing expectations.

Information Needs and DIKW Pyramid
Users' retrieval behaviour is driven by their information needs, and different needs affect how individuals perform a search (Rosenfeld, Morville & Nielsen, 2002). Known-Item Search and Exploratory Search are two widely recognized and adopted search modes. From a more complete perspective, I conclude that there are four search modes in total, each applying to a different information need:

1. Own-item search refers to locating/re-finding an item the user is already familiar with;
2. Known-item search refers to either finding information the user has already heard of or using Q&A (i.e., looking up the answer to a single question);
3. Exploratory search refers to learning something within a specific domain; and
4. Exhaustive search refers to mastering all resources related to a specific topic.
The difference between these four information needs does not only concern the volume of information as shown in Figure 4; rather, the level of information in terms of hierarchy and function the users are searching in different modes is distinct, as shown by the data-information-knowledge-wisdom (DIKW) pyramid model.
The DIKW pyramid, which is widely recognized in library and information science, is a 'taken-for-granted' model illustrating the relationship between data, information, knowledge, and wisdom in structure and function (Rowley, 2007). To some extent, it also perfectly illuminates users' deep searching goals in different search modes.

Own-Item Search to Locate/Refind Data
Data are symbols, representations of objects or event properties (Ackoff, 1989), with the purpose of knowing nothing (Zeleny, 1987). When performing an own-item search, in which users need something again and want to refind it (with the prerequisite that they already know and are familiar with the context of what they are trying to find), they simply want to recall written words or a detailed description. For instance, users may wish to recall the figure that represents Microsoft's profit during 2017 from a financial report they read before. Ultimately, these users are looking for information that corresponds with the concept of data, which is a single element and of no use if not in a certain form/context.

Known-Item Search to Find Information
Information is distinct from data given its usefulness, which answers who, what, where, how, and when. In a known-item search, rather than obtaining an instant word/signal for their queries, users face uncertainty about the result, which makes them pay attention to more details of what they have found. For example, consider a user who wants to identify IBM revenues in 2017 using the Internet. He/she would look into a mass of information from search results to ensure the number precisely represents the IBM revenues in 2017 (e.g., not gross profit, not revenue from other companies, and not revenue earned in other years). The complete information related to search queries is the goal of a known-item search.

Exploratory Search to Learn Knowledge
Knowledge answers "how" questions by applying data and information (Ackoff, 1989). It is self-evident that knowledge coincides with the goal of an exploratory search (which implies users' needs for domain discovery, knowledge increasing, new-topic learning, etc. (Palagi, Gandon, Giboin & Troncy, 2017)), in that users want to learn through the searching process, so they collect a few good pieces from the results to form their understanding of a certain area.

Exhaustive Search to Master Wisdom
Wisdom involves knowing why and evaluating one's understanding of a field. If a user does an exhaustive search (i.e., he/she wants to see every single result related to a given topic), then he/she may be researching for a doctoral dissertation, conducting competitive intelligence analysis, or otherwise trying to master a domain with critical comprehension.

Growing Needs with Time
Combining the DIKW pyramid with search modes helps to understand users' information needs in greater depth and detail. Furthermore, when examining these needs over time, changes in machine-processed content drive the evolution of users' information needs.
As referenced in a report from Dr. John E. Kelly III, Senior Vice President in IBM, the history of computing can be divided into three eras (Kelly, 2015): 1. tabulating era: from the 1900s to 1940s; 2. programming era: from the 1950s to present; and 3. cognitive era: from the 2010s to present.
In each of these eras, the things that machines could process were different, which also drove changes in users' searching needs. In the tabulating era, data had not been dematerialized and were instead stored on physical punched cards. Humans' retrieval behaviour was hence limited by the physical card, similar to an own-item search. The goal of searches during this era mainly concerned locating an item and obtaining data users already owned. The programming era transformed information into digits, which are no longer attached to a physical container. People can therefore look for any information they know or have heard of rather than only what they own; hence, the known-item search was empowered and became the main focus of search engines during that time.
The knowledgeable machines of the current Cognitive Era have also increased individuals' expectations of learning from and with them. Search engines, as the entrance to vast Internet content, face the most pressure and have attracted great expectations regarding the adoption of AI capabilities, which could enable people to gain knowledge through the Internet. This is also the main starting point of this paper for rethinking our search tools.
The future after the Cognitive Era is still uncertain, but machines may help humans make the right decisions and go in the right direction (Wang, 2013), especially by conducting exhaustive searches to master wisdom. Based on previous trends, however, human enhancement technologies (HET) would be the most feasible potential breakthrough to drive the next era. These technologies are expected to enhance memory, enable multi-dimensional thinking, support in-built machine thinking, etc. (Cochrane, 2014). At that point, machine is human and human is machine, and the machine could integrate its capabilities to boost human wisdom.

The Evolution of Search Engines
Though Google has remained a monopoly since the 2010s, the same will not necessarily hold true over subsequent decades. Essentially, a search engine is simply a tool to meet users' information needs, which should evolve with users and technologies. Information foraging theory provides an argument, grounded in human information behaviour, for how search tools should adapt to live up to users' expectations.

Information Foraging
By examining the hunting behavior of animals, anthropologists and ecologists proposed optimal foraging theory in the 1970s, reflecting the strategy in which animals tend to accept the smallest cost for the biggest benefit (energy) when searching for food. The similarities between this pattern and human behavior when searching for information led Peter Pirolli and Stuart Card to develop information foraging theory in the 1990s.
The theory covers three aspects:
1. information patches: research about the environment that has been filled with information clusters;
2. information scent: focusing on cues that could reveal the value of information; and
3. information diet: addressing how humans select and pursue information items.
Derived from optimal foraging, the information diet model describes people's tendency to select information sources based on profitability (i.e., the amount of value that can be obtained for every unit of cost spent processing the resources) (Pirolli & Card, 1999). For example, in optimal foraging, a hunter does not always chase the biggest prey, because the evaluation also depends on whether the prey is easy to catch.
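The profitability idea in the information diet model can be illustrated with a minimal Python sketch. The `Source` class, the example sources, and their value and cost figures are hypothetical and chosen only for illustration; the sketch simply ranks sources by value per unit of processing cost and does not reproduce the full diet model of Pirolli and Card.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical information source with illustrative numbers."""
    name: str
    value: float  # estimated usefulness of the information (arbitrary units)
    cost: float   # time or effort needed to process the source (arbitrary units)

    @property
    def profitability(self) -> float:
        # Value gained per unit of cost spent processing the source.
        return self.value / self.cost


def rank_by_profitability(sources):
    """Return the sources ordered from most to least profitable."""
    return sorted(sources, key=lambda s: s.profitability, reverse=True)


sources = [
    Source("long survey paper", value=9.0, cost=6.0),  # profitability 1.5
    Source("short blog post", value=4.0, cost=1.0),    # profitability 4.0
    Source("video lecture", value=6.0, cost=3.0),      # profitability 2.0
]

ranked = rank_by_profitability(sources)
print([s.name for s in ranked])
```

In this toy data the survey paper carries the most value, yet it ranks last because it is the most expensive to process, which mirrors the hunter who passes over the biggest prey when it is hard to catch.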

PageRank and Google's Success
Google was the first to apply the idea of the information diet to search result ranking, through the PageRank algorithm, which explains its popularity and success. With the mission to "organize the world's information and make it universally accessible and useful," Google adopted PageRank for search result ranking rather than sorting results by keyword frequency on a page, as most other search engines did at that time.
PageRank, named after one of the founders of Google, Larry Page, establishes a web page importance hierarchy by making use of the link structure of the World Wide Web (Page, Brin, Motwani & Winograd, 1999). In simple terms, if webpage A has been referenced by a hyperlink from webpage B, then page A will be considered as having been endorsed by page B and will receive a weighted mark. The weight will depend on the importance of page B. This way, all web content can be ranked fairly by importance using this link structure. PageRank offers a revolutionary way to organize search results by their real values. Compared with traditional methods (i.e., ranking the results by the number of times the search terms appear on a page), PageRank is much more accurate in helping users gain valuable results to fulfill their information needs. As a result, after users enter a search query into the search box, Google can conserve users' processing effort by displaying the most valuable answers at the top, which makes Google the most profitable way to perform a Known-Item Search.
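The endorsement mechanism described above can be sketched as a simple iterative computation. This is a simplified illustration, not Google's production algorithm: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the three-page example graph are all assumptions made for the sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate page importance from a link graph.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with equal importance for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its importance among the pages it endorses.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: B and C both link to A, C also links to B, A links to C.
links = {"A": ["C"], "B": ["A"], "C": ["A", "B"]}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(page, round(score, 3))
```

Page A is endorsed by two pages while B is endorsed by only one, so A ends up with a higher score than B, and the weight of each endorsement depends on the score of the page giving it.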

Methodology
The question of what a new search engine would look like in this era naturally follows after understanding the shift of information need from Known-Item Search to Exploratory Search. By analysing the information search process (ISP), this section discusses the distinctive behaviours (or possible pain points) of using current search engines for exploratory searches, which mainly arise in the three stages of the process that involve performing a search: pre-focus exploration, focus formation and information collection. The problems encountered during each stage lead to four points that a new search engine in this era could focus on: keyword input, navigating output results, interacting with search results and digesting collected information.

What is Information Search Process (ISP)?
The information search process, which Carol Kuhlthau proposed and developed in 1991, is a holistic framework in library and information science that examines the information-seeking process through six stages: task initiation, topic selection, pre-focus exploration, focus formation, information collection, and search closure.
Kuhlthau studied those conducting research or resolving problems encountered during work (Kuhlthau, 1993), namely those seeking to acquire related knowledge through a search process to resolve research or work problems. This model was initially applied in library services and has been recommended for adoption in information retrieval systems to cultivate a more user-centred environment (Kuhlthau, 1999). It shares target users with exploratory searches (i.e., those who aim to acquire knowledge in certain domains through information seeking), which makes it a perfect model to analyse users' behaviour when using Google-like searching tools for various purposes.
The search engine becomes involved in the third stage (pre-focus exploration), when users have an idea of a general topic and try to find a possible focus. They may still struggle to describe precisely the information they need, and they will likely feel confused and uncertain (Kuhlthau, 2005). These emotions are often exacerbated by current search tools, primarily due to two factors:

1. Current search engines only start searching from keyword queries. When a user cannot formulate thoughts into related words, search engines are unhelpful in reaching information goals. Users can only retype searches repeatedly to test whether search results suit their needs.

2. The search engine result page (SERP) is ranked by relevance to the user's search terms, which takes every web page out of its context. Reading through the results can make it difficult for a user to construct a skeleton of the topic, especially when entering an unfamiliar domain.

The fourth stage (focus formation) is when users conclude and form a focus of their search based on information they have received (Kuhlthau, 2005). During this time, the search is no longer a query-answer approach but rather becomes an iteration and interaction, in that the information users find alters or clarifies the focus and direction of their search (Morville & Callender, 2010). In contrast, search results nowadays are still presented in the form of a list, with no interactive aspect that could fit users' mindset.
The fifth stage (information collection) is when users organize information and identify their next steps (Kuhlthau, 2005). This is when they digest the information collected and identify its context, which is the ultimate goal of the entire exploratory search process (i.e., knowledge acquisition) but can prove highly challenging for a domain novice. However, query-response search engines give no consideration to this, which leaves users with a mass of unstructured information they have collected.

Four Main Features of Exploratory Search
By analysing user behaviour during exploratory search through the information search model, features of the new search engine in this era become relatively clear; they possibly consist of four points: input, output, interaction, and digestion.
A well-considered exploratory search experience design should address these four points to help users better fulfill their information needs.

1. Input: As users conducting exploratory searches are mostly domain novices in their areas of interest and thus uncertain about which channels to use to reach their goals (White & Roth, 2009), they may struggle to use the right keywords to find appropriate information. This situation is most common when a user is unaware of the jargon in a certain field but still needs to explore certain knowledge within that domain. For example, when studying the ethics of electronics addiction, an individual might have difficulty finding relevant information without knowing the term "attention economy", which is a necessary keyword to access that knowledge.

2. Output: Users are no longer looking for a single answer (Bickham, Bradburn, Edwards, Fallon, Luke, Mossman & Ness, 2008); they are discovering knowledge and exploring topics to improve their understanding and decision making. They will pore over many results rather than considering only one or two at the top of the list. In this case, the SERP ranking is not ideal for exploratory searches. Users do not expect one optimal result but are instead interested in the context of that knowledge base, in which they can quickly glimpse the whole and follow a direction that piques their interest.

3. Interactions: The exploratory search process can transform into an iterative and interactive experience in that what people discover from their initial trial search returns changes in subsequent searches (Morville & Callender, 2010). Often, users learn something from the first results page that helps them rephrase the query to obtain more precise results. Also, users may have further questions when digesting the first results page and then conduct a new search to gain further information.

4. Digestion: Exploratory searches do not end at getting the result because the goal is to learn. Frequently, users need to digest all information received and assess their own context of a certain domain. They may also need to analyse the information they already have and compare it to evaluate the value of resources and determine what more they need. A common example is using Evernote or other resource management tools, such as mind mapping, to contextualize the information users have already. Users tend only to end a search once they completely understand their scope of knowledge.

Results
Extensive research and seminars have been conducted around the topic of exploratory searching by other researchers, and many solutions have been proposed to improve it. Facet Search is a solution to refine users' input when conducting an exploratory search. Facet Search suggests considering keywords when retrieving information and enables users to access information by inputting queries from other dimensions (e.g., when the information was published, the author of the article, and the language of the text) (Tunkelang, 2009). Facet Search has been widely applied across many search engines and has mostly been used as a SERP filter.
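How facets act as a filter over search results can be shown with a small Python sketch. The document records, the facet names (`author`, `year`, `language`) and both helper functions are hypothetical; real search engines implement facets inside their indexes rather than with list filtering.

```python
from collections import Counter

# Hypothetical result set; each record carries facet values alongside its title.
documents = [
    {"title": "Intro to computer vision", "author": "Lee", "year": 2017, "language": "en"},
    {"title": "Visão computacional básica", "author": "Silva", "year": 2018, "language": "pt"},
    {"title": "Advanced computer vision", "author": "Lee", "year": 2018, "language": "en"},
]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(doc[facet] for doc in docs)


def filter_by_facets(docs, **selected):
    """Keep only the documents that match every selected facet value."""
    return [
        doc for doc in docs
        if all(doc.get(facet) == value for facet, value in selected.items())
    ]


print(facet_counts(documents, "author"))
hits = filter_by_facets(documents, language="en", year=2018)
print([doc["title"] for doc in hits])
```

The counts give users an overview of the dimensions available for narrowing a result set, and the filter narrows the results without requiring a new keyword.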
Thesauri Search is another solution that searches by a user's exact keywords and considers synonyms that may be related. Like Facet Search, this method has been widely adopted to optimize search results in many search engines. Other solutions include "Keyword Suggestion," which suggests keywords when users type in queries, and "People Also Search," which lists what other people's related searches have been based on a user's input.
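The synonym handling behind Thesauri Search can be sketched as query expansion. The `THESAURUS` table, the example documents and the word-overlap matching are assumptions made for illustration only; production systems rely on curated thesauri and far richer matching.

```python
# Hypothetical thesaurus mapping a keyword to related terms.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "picture": {"image", "photo"},
}


def expand_query(query, thesaurus=THESAURUS):
    """Return the user's exact keywords plus any synonyms found for them."""
    terms = set()
    for word in query.lower().split():
        terms.add(word)
        terms |= thesaurus.get(word, set())
    return terms


def search(query, docs):
    """Return documents sharing at least one word with the expanded query."""
    terms = expand_query(query)
    return [doc for doc in docs if terms & set(doc.lower().split())]


docs = ["Automobile repair guide", "Cooking with herbs", "Photo editing basics"]
results = search("car picture", docs)
print(results)
```

Neither "car" nor "picture" appears in any document, yet two documents are still retrieved through their synonyms, which is how this approach helps users who do not know the exact vocabulary of a domain.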
Though quite a few solutions for exploratory search have been integrated into current search engines, many possibilities remain. The main problem is that current solutions focus primarily on "input", helping users form a query. However, the output, interaction, and digestion aspects of exploratory search remain undeveloped, although they are also essential for retrieving the most profitable content for users and for the overall search experience.

Discussion
From an information foraging perspective, profitability is a key factor in information consumption, which is the main reason behind Google's success during the programming era (1950s-present). Google retrieves the most valuable results by adopting the PageRank algorithm, which evaluates the importance of web pages to better rank search results to fit users' expectations. However, this profitability is one reason why Google-like searches (i.e., the query-response paradigm) may lose their edge in the current knowledge age and cognitive era (2010s-present).
Users' information needs were restricted to known-item searching during the programming era, when machines could only store information that was not linked or structured. However, when shifting to the Knowledge Era with AI as the leading technology, machines can now organize unstructured data and make decisions via natural language processing and knowledge representation. Machines can conduct known-item searches proactively to present users with the information they need after analysing and predicting their information behaviour, rather than after users perform a search action. These types of developments forecast the decline of traditional Google-like search engines.

Conclusion
As machines possess beyond-human knowledge now, users' information needs have grown to acquire knowledge from the search process. Essentially, users' active search behaviour has shifted gradually to exploratory searching, the goal of which is to learn knowledge. A new search engine is hence needed to improve the search experience. Possibly, by analysing the information search process, the features of a new search engine could focus on four points:

1. input: enabling users to search when they do not have a clear goal or keyword in mind;
2. output: helping users gain contextual familiarity with a topic before they select an area of interest;