Evaluation 1

slides


Readings

Reading Critiques

Jonathan Albert 17:58:36 10/16/2017

Evaluation: This paper discusses proper ways to evaluate UI toolkits and other such complex systems given the state of modern technology. It discusses why conventional evaluation methods are insufficient and proposes new tests for viability. The principles of simplifying interconnection and combination mentioned near the end of the paper are implemented in many modern programming environments. For example, Java and its cousins operate on the JVM, and .NET Framework code compiles to MSIL for the CLR. By making frontends and backends meet at a single point, designers of new languages can focus on that single point of interoperability. I see the value in such an approach, which enables supporting a wide array of new hardware and software without combinatoric complexity (a short sketch of this idea follows the post). The author's admonition to evaluate a prospective system's importance was likewise insightful. To introduce a paradigm shift into a user's workflow, the gains must substantially outweigh the retraining costs. The 100% factor of increase may seem high, but it makes sense when a new system also has to confront a sunk-cost bias. Without great benefit, it is understandable that, for instance, Dvorak keyboard layouts are uncommon compared to the widespread QWERTY despite being more efficient.

----

Methodology: This document covers broadly the subject of how experiments are conducted: what methods are used to collect and analyze data. It stresses the limitations and tradeoffs associated with various methods. The discussion of base rates was informative; failure to consider them accounts for many incendiary or ill-formed reports. Asserting a measure is "high" or "better" makes no sense apart from a reference point. The author does acknowledge, however, that base-rate data alone will not eliminate mis-reported data or biases. Nevertheless, striving to make experiments more "tractable" so that others may gain from them is important. The author makes a similar principled statement when discussing random variables: the randomness is not automatically inherent in the variable itself, but is rather indicative of the selection procedure. This introduces a healthy degree of skepticism toward one's own methods of testing, and helps to motivate deeper consideration about the method of the method. In other words, scrutinizing and attempting to eliminate confounding or happenstance biases by a second-order analysis could be quite beneficial.
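A minimal sketch of the single-meeting-point idea from the first critique above; all names are hypothetical stand-ins for things like JVM bytecode or MSIL, not real APIs. With one shared intermediate form, N front ends and M back ends need N + M adapters instead of N * M pairings:

    from dataclasses import dataclass

    @dataclass
    class IR:
        """Stand-in for a shared bytecode such as JVM bytecode or MSIL."""
        ops: list

    def compile_lang_a(source: str) -> IR:          # front end 1
        return IR(ops=[("push", w) for w in source.split()])

    def compile_lang_b(tokens: list) -> IR:         # front end 2
        return IR(ops=[("push", t) for t in tokens])

    def run_on_vm_x(ir: IR) -> int:                 # back end 1
        return len(ir.ops)

    def run_on_vm_y(ir: IR) -> int:                 # back end 2
        return sum(1 for op in ir.ops if op[0] == "push")

    # Any front end pairs with any back end through the single IR.
    print(run_on_vm_x(compile_lang_a("hello world")))    # 2
    print(run_on_vm_y(compile_lang_b(["a", "b", "c"])))  # 3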

Tahereh Arabghalizi 15:50:05 10/17/2017

Evaluating User Interface Systems Research: This paper presents a variety of alternative standards by which complex systems can be compared and evaluated. These criteria are not novel but have recently fallen out of favor. The authors emphasize avoiding the trap of only creating what a usability test can measure. They also say that we should avoid the trap of requiring new systems to meet every evaluation at once, which would recreate the fatal flaw fallacy. The claims made by such systems should be compared against STU: Situations, Tasks, and Users. The authors claim that systems should be evaluated based on users' goals in terms of importance, unsolved problems, and generality. My opinion of this paper is that the authors restated the real-world shortcomings of user interface toolkit evaluation and showed how some errors can be avoided in the future.

----

METHODOLOGY MATTERS: DOING RESEARCH IN THE BEHAVIORAL and SOCIAL SCIENCES: This paper defines three domains: substantive (the object of a study), conceptual (ideas about or properties of the substantive domain), and methodological (techniques for doing the research), and then explains them in detail. Using multiple methods can enhance confidence, especially when the different methods compensate for each other's limitations. Three features are desirable while collecting evidence for a study: generalizability (over relevant populations), precision (of measurements), and realism (a contrived research setting may not translate convincingly into a real-world conclusion). However, no methodology can achieve all of these features together. The authors then introduce three important comparison techniques: base rates (notable observations can only be made when context is available), correlations (frequently misunderstood as causation), and differences (studying the interaction effects of different variables on each other). Later on, the authors discuss four different types of validity, six different types of measures, and ways to manipulate variables, but the major message the author tries to convey is that all of the techniques explored could be "best for something and worst for something else." I think that this paper and the previous one have created awareness and understanding of the different research methods and evaluation techniques.
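The base-rate point above can be made concrete in a few lines of Python; all numbers here are hypothetical. A raw count looks "high" until its base rate (exposure) is considered:

    errors_new, sessions_new = 40, 2000   # new UI: 40 errors over 2000 sessions
    errors_old, sessions_old = 12, 300    # old UI: 12 errors over 300 sessions

    rate_new = errors_new / sessions_new  # 0.020 errors per session
    rate_old = errors_old / sessions_old  # 0.040 errors per session

    # Against the base rates, the "worse-looking" system is actually better.
    print(f"new UI: {rate_new:.3f} errors/session, old UI: {rate_old:.3f}")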

Xiaoting Li 10:21:55 10/18/2017

1. Evaluating User Interface Systems Research: This paper introduces the importance of UI systems research. The author points out several evaluation errors caused by misapplied evaluation methods. In addition, the author presents metrics for evaluating a user interface system. The structure of the paper is clear and easy to follow. A good take-away message from this paper is its treatment of evaluation methods. The correct way to evaluate a new system is not to compare it directly with an existing system but to invite users who have little knowledge of either the new or the existing systems to use both. If users perform better on the new system, then we can say that the new system outperforms the existing ones.

2. Methodology Matters: Doing Research in the Behavioral and Social Sciences: In this chapter, the author gives us a detailed introduction to research methods, research strategies, study design, how to compare different techniques, definitions of different types of validity, classes of measures, and techniques for manipulating variables. The important take-away message from this chapter is that results depend on methods. Methods are not perfect; each method has its own advantages and disadvantages. If we want persuasive results, we need to compare different methods and choose the ones that are suitable for the study. Before reading this chapter, we would usually apply one method in one study. However, the chapter mentions that we can use several diverse methods together so that the strengths of some methods offset the weaknesses of others. This is what we need to pay attention to when we do research studies in the future.

Spencer Gray 14:31:08 10/18/2017

In the first paper, Methodology Matters, the author studies how scientists do research in the context of the social sciences. He identifies the basic features of content, ideas, and techniques, and expands them into the substantive, conceptual, and methodological domains. The author explains and discusses each domain as he goes into greater detail on the methodological domain. To me, this paper has little significance in the HCI field. The author took a fairly obvious concept and explained it in a confusing way. While understanding what makes research and evidence strong is important, all researchers understand the benefits and drawbacks of certain approaches. For instance, all researchers should know that self-reports will have bias and that observations can be inaccurate due to human error.

In the second paper, Evaluating User Interface Systems Research, the author explores the mistakes made in evaluating UI systems, and then makes recommendations on how to evaluate these systems. The author notes that current techniques prevent accurate measurements and also prevent large innovations. Researchers focus too much on measuring usability in a quantified manner. They try to find participants for their studies who have never used their system or the one they wish to compare against. However, this misses the target of who the UI systems should be geared towards: new systems should be geared toward expert users. The author argues that a new system needs to be twice as good as the previous system for expert users to bother learning how to use it. This paper is significant in HCI research because it suggests many other effective tools for measuring a UI system that are more qualitative than quantitative. Many researchers will only pursue a project that can show its benefits over another system in a graph or some other numerical measure. This encourages small innovations and discourages large-scale meaningful projects.
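The "twice as good" argument can be illustrated with a rough break-even calculation; all figures below are hypothetical, chosen only to show the shape of the tradeoff:

    old_minutes_per_task = 10.0
    new_minutes_per_task = 8.0            # a 20% improvement
    retraining_minutes = 40 * 60.0        # roughly a week of lost productivity
    tasks_per_month = 200

    saved_per_month = (old_minutes_per_task - new_minutes_per_task) * tasks_per_month
    print(f"break-even after {retraining_minutes / saved_per_month:.1f} months")  # 6.0

    # At double the performance (5.0 minutes per task), break-even drops to
    # 2.4 months, closer to a gain an expert would accept.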

Mingzhi Yu 18:27:40 10/18/2017

Evaluating User Interface Systems Research: This paper reads like a summary of user interface evaluation methods, past and current. The author discussed several flaws of past evaluation principles and how these flaws prevented UI evaluation from developing. It mentioned three aspects: the usability trap, the fatal flaw fallacy, and legacy code. In general, aside from old legacy code, the biggest barrier here seems to be that the experimental assumptions are not perfect. Besides that, the authors summarized and scrutinized many possible claims about evaluation frameworks. I am interested in the point about expressive match. It seems that today expressive match matters more often because of the competition between different developers, and it is hard to judge who wins when the difference is trivial. The paper also mentioned the test of a "design flaw challenge", which is very interesting.

Methodology Matters: Doing Research in the Behavioral and Social Sciences: Even though working in the field of social and behavioral science is different from computer science, there are still some connections. HCI, especially, sometimes seeks solutions to, and discusses, the relationship between human beings and computers from psychological or behavioral perspectives. Therefore, when the authors talked about the details of good methodology in the social sciences, I could always relate them to some experiment or even product in the HCI field. This book chapter presents some tools, strategies, tactics, and operations. It also mentioned their inherent limitations and potential strengths, which matches my own feeling when designing experiments. An interesting point is that it proposes using multiple methods to improve confidence, which sounds like the idea of using multiple modalities to model a problem. Besides that, it also gives a good summary of using statistical significance to test hypotheses. In general, this is a good article that summarizes some important ideas for choosing a research methodology.
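As a concrete illustration of the statistical-significance point above, here is a sketch of a two-sample permutation test on synthetic data (this is one simple significance procedure, not the chapter's own):

    import random

    condition_a = [12.1, 9.8, 11.4, 10.9, 12.6, 11.0]  # e.g., task times
    condition_b = [9.2, 8.7, 10.1, 9.5, 8.9, 9.8]

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(condition_a) - mean(condition_b)

    pooled = condition_a + condition_b
    extreme, trials = 0, 10000
    for _ in range(trials):
        random.shuffle(pooled)
        diff = mean(pooled[:len(condition_a)]) - mean(pooled[len(condition_a):])
        if abs(diff) >= abs(observed):
            extreme += 1

    # A small p-value means the observed difference is unlikely under chance alone.
    print(f"p ~= {extreme / trials:.4f}")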

MuneebAlvi 18:50:37 10/18/2017

Critique of Methodology Matters. Summary: This reading argues that research is only as good as the methods it uses; research data also contains the flaws of the methods used to collect it. I think the topics in this paper are very relevant to the software we are using today. I am sure some research and testing happens internally at companies like Google or Facebook before they release their next product or update. However, they typically find varying results once they release it to the public. For example, once a video game or console releases, the companies usually find out that the general public is playing the game very differently than what they intended. This leads to users finding bugs or breaking the game completely. I think this is because a lot of the research the companies do internally does not have external validity. Another argument might be that the companies cannot anticipate which kinds of users, or the number of users, will use the system, which also limits external validity. The reading also mentions self-reports by users. In this vein, a lot of websites and digital services run surveys from time to time to help determine the user experience. This may not lead to the most precise results, as many users might answer quickly just to get back to what they were doing. On the other hand, sometimes users' first reactions can be the strongest.

Critique of Evaluating User Interface Systems Research. Summary: This paper argues that usability alone is no longer good enough to measure modern UIs, which are more complex than earlier UIs. The paper mentions that many UIs, at the time it was published, were still designed with the mindset that users are using computers for the first time. However, I don't believe this is the case anymore. Many UIs on mobile devices assume that users have used touch devices previously. This might be one of the reasons that the button affordance has been taken away by iOS and Android. Desktops have also evolved to assume that the user is accustomed to using desktops. In fact, many desktop OSes integrate mobile features into them. For example, Windows 10 runs on many devices that can change from a laptop to a tablet, and the user interface changes accordingly. Another example is that macOS integrates Handoff with the iPhone, which means a user can start an email on the iPhone and finish it on the laptop.

Sanchayan Sarkar 19:48:44 10/18/2017

CRITIQUE 1 (Methodology Matters: Doing Research in the Behavioral and Social Sciences): This chapter is a fundamental work on categorizing the various elements that go into designing experiments, establishing their validity, and the different performance measures used to analyze a research study. The biggest strength of this paper is encapsulating the various dimensions of research strategies in a diagram and categorizing under which quadrants each strategy falls; Figure 2 of the chapter demonstrates this perfectly. Indeed, the three aspects of a research work are generalizability, measurement precision, and the realism of the situation, and no experiment can fully maximize all three dimensions. In my work on human face recognition, I faced a similar dilemma when my algorithm was robust to various input scenarios but was unable to beat the highest precision on certain scenarios; my work sacrificed precision for generalizability. The paper projects these scenarios in a great way. Beyond these dimensions, the paper also produces categories of obtrusiveness, abstraction, and concreteness. Every experiment, whether it be a judgment study, a sample survey, or a field experiment, falls somewhere in the spectrum produced by these dimensions. The next important merit of this chapter is the comparison techniques, such as base rates, correlations, and differences, which are essential in almost every experiment. Also, the paper goes over the distinction between "randomization" and a "true experiment", as situations considered to be random are often not truly random, since they undergo some sort of bias. Another merit of the paper is explicitly illustrating the different validity scenarios, external and internal validity; it builds upon Campbell's categorization of validity and the threats that can occur over the duration of an experiment. Finally, it lists the types of measures: self-reports, observations, trace measures, and archival records. It is interesting that methods for manipulating variables can border on unethical practice. In my opinion, this chapter is quite exhaustive in presenting the vocabulary and the borderline scenarios of every concept for designing and evaluating experimental studies. Had this chapter presented more illustrative examples for every scenario, it would have been much better. However, that is a minor flaw, and this chapter is ultimately critical for people who desire to construct experimental studies and evaluation scenarios in a formal and systematic way.

****************************************************************************

CRITIQUE 2 (Evaluating User Interface Systems Research): In this paper, the author posits the need for evaluation of complex user interface systems, its importance, the pitfalls of certain evaluation criteria, and the different parameters that need to be set up for better evaluation. One of the merits of the paper is its explicit citing of the different errors that can occur while evaluating complex UI systems. First is the usability trap, whose assumptions comprise standardization, no pre-existing training, and the scale of the problem; most modern systems violate these three assumptions, so evaluation scenarios must set them aside. Second is the "fatal flaw fallacy", which demands that designers account for every possible flaw. Third is legacy code. The latter two demands are impossible for modern systems, which have so many components that pre-accounting for the whole range of errors is impossible.

Another important aspect is the role of Situations, Tasks, and Users (STU) in evaluating systems. The paper does well to demonstrate that systems which can validate the importance, novelty, and generality of the system across a diversity of user populations, target tasks, and varied situations will fare better than others. Finally, the paper's view on the expressive leverage of a system is interesting: a system whose expressed design choices correspond more closely to the target solution is a positive, since it reduces the cognitive load on the user's side in restating solutions. The paper discusses other factors like the combination of primitive building blocks, providing a better platform for designers, and the scalability of the system. In all, the paper is very detailed and well categorized as far as the vocabulary of evaluating systems is concerned. The only negative I found was the lack of convincing examples to differentiate the sub-categories of evaluating systems; had the paper been more convincing in its examples, it would have been better. Nevertheless, this can be considered a classic paper for anyone seeking to set up good user studies and evaluation scenarios.

Charles Smith 22:20:00 10/18/2017

On: Methodology Matters. This paper looks at the conducting of experiments. It gives a couple of guidelines to consider when you're experimenting. One idea it conveys is that there are many ways to conduct a study, whether that be in a lab or out in the field; these each have their own benefits and weaknesses. One thing I had not considered before is that, as the author states, you don't have to try to make your choice of experiment weakness-free, as long as others in your field can repeat your study while taking those weaknesses into account. Another point the author makes is that your research pool probably won't ever be truly random. It is good to consider how you selected your participants and what effects that could have on your study. A non-totally-diverse group of participants can still be part of a valid study.

On: Evaluating User Interface Systems Research. The author of this paper stresses that usability testing is not the only way to evaluate an interface, and he calls out some other concerns that stop researchers but shouldn't. Usability testing seems like the easiest, and best, form of testing at first glance: it provides clear statistical numbers and can easily be used to show improvements. But usability testing is not the whole picture and shouldn't be treated as such. The author also states that the existence of legacy code, and the desire to integrate with it, shouldn't stop researchers from trying new ideas. Legacy code is certainly important to today's computing, as throwing out lots of already-developed code can be very costly and time consuming. Even newer systems, such as Android and iOS, use legacy code dating back before 1999 (18 years). Drastic changes that remove the usefulness of this code also bring on the cost of new development in other areas. While in the long run new ideas that throw out this old code may be better, a company that seeks to do just this may run out of funding while developing the new code and never get to see that future.

Xingtian Dong 23:22:37 10/18/2017

1. Reading critique for 'Methodology Matters: Doing Research in the Behavioral and Social Sciences': I think this paper is still useful to some extent. The paper introduced some methods for experiments, and the author also listed the advantages and disadvantages of those methods; mixed methods can overcome some limitations of a single method. This is helpful for us in designing an experiment. The author also provides some research strategies, and no single strategy can maximize all the desirable goals of precision, generalizability, and realism. Admittedly, this paper is old, and there might be more thorough methodologies for experiment design by now, but the paper is still valuable. Another useful part is about validation techniques, which allow researchers to determine the reliability of an experiment. Besides, the last section gives different ways to measure variables from participants; self-reports, observations, and so on are all very inspiring. This is useful for designing the evaluation for our project experiment.

2. Reading critique for 'Evaluating User Interface Systems Research': This paper is very useful for learning how to evaluate user interfaces. The author tells us some principles for designing and evaluating a user interface, and presents some evaluation errors that let us know which common mistakes to avoid. I think it is quite useful for students like us who are just starting to learn interface design. By the way, I think the low skill barrier sounds like the low threshold we learned about in a previous paper; it is an important principle for designing interfaces. The author also points out the difficulties of usability studies for user interface systems. I wonder whether, ten years after the paper's publication, there are any new methodologies to overcome these difficulties. Because nowadays there are many successful interfaces which have been in use for years, how could we evaluate new interfaces that compete with these old ones? I think it would be a really interesting topic. The most significant part of the paper is that the author formalizes the situations, tasks, and users (STU) context.

Ahmed Magooda 23:53:28 10/18/2017

Evaluating User Interface Systems Research: In this paper, the authors discuss different approaches for evaluating new user interface systems and for making sure a system actually adds new value. The author argues that most people focus only on usability as the sole measure of a UI system; however, for complex systems this shouldn't be the case any longer. The author then discusses why UI systems keep advancing even though existing systems should already be suitable for users, and summarizes this phenomenon in a few points (user needs are evolving, old hardware and software restrictions are no longer there, etc.). The author then discusses some problems we can face when evaluating a system: the usability trap, the fatal flaw fallacy, and legacy code. Furthermore, the author argues that we should know about the situations, tasks, and users; by knowing these three concepts we can talk about the importance of the system. It is also worth mentioning that tools can introduce new populations, beyond programmers, to the UI design process. At the end, the author says that some tools can be effective by supporting the combination of more basic blocks or by simplifying the interconnections between components (see the sketch at the end of this post).

----

METHODOLOGY MATTERS: DOING RESEARCH IN THE BEHAVIORAL and SOCIAL SCIENCES: In this paper, the author discusses the nature of the research process and its important elements (content, assumptions, tools, methods, and procedures). The author then discusses some interesting findings: methods enable but also limit evidence; all methods are valuable but all have weaknesses or limitations; and you can offset the different weaknesses of various methods by using multiple methods. The author continues with what kinds of research methods and research strategies we should use when conducting behavioral or social science research, and then provides some specific strategies in detail. The author then talks about classes of measures and manipulation techniques, such as the potential classes of measures in social psychology, alongside classes of data collection (self-reports, trace measures, observations by a visible observer, observations by a hidden observer, public archival records, and private archival records). The analysis of these measures is also discussed. The author then concludes the work by emphasizing that results depend on methods, and all methods have limitations.
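A minimal sketch of the "combination of basic blocks" point from the first critique above: a hypothetical mini-toolkit (not Olsen's own example) where a small set of primitives plus one uniform interface is enough, because every component renders the same way and containers compose freely:

    class Label:
        def __init__(self, text):
            self.text = text

        def render(self, indent=0):
            return " " * indent + self.text

    class Column:
        def __init__(self, *children):
            self.children = children

        def render(self, indent=0):
            return "\n".join(c.render(indent + 2) for c in self.children)

    # Two primitives already express arbitrarily nested layouts.
    menu = Column(Label("File"), Column(Label("Open"), Label("Save")))
    print(menu.render())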

Krithika Ganesh 0:26:18 10/19/2017

Methodology Matters: Doing Research in the Behavioral and Social Sciences: This paper points out tools, strategies, tactics, operations, and some of the inherent limits and potential strengths of various features of the research process by which behavioral and social scientists do research. The basic features of the research process are: content that is of interest, ideas that give meaning to that content, and techniques or procedures by means of which those ideas can be studied; these formally define the substantive domain, the conceptual domain, and the methodological domain. While choosing a setting for a study, we can try to maximize only one of the following: generalizability, precision, or realism. Some of the major research strategies are field strategies, experimental strategies, respondent strategies, and theoretical strategies. Comparisons can be based on base rates, correlation questions, and difference questions. The paper stressed the importance of randomization, sampling, allocation, and statistical inference. It also explains the types of validity: internal, external, and construct validity. The potential classes of measures (self-reports, observations, archival records, trace measures) and their strengths and weaknesses were explained. Finally, the techniques for manipulating variables were explained: selection, direct intervention, and inductions. This paper is an eye-opener for HCI researchers on making the right decisions while performing experiments. Overall, I learned a lot, and the most important take-away was that collecting evidence should not be viewed as a limitation; before conducting an experiment, we need to understand its nature, see what category of research it falls into, and make the right decision.

---- END OF FIRST CRITIQUE ----

Evaluating User Interface Systems Research: This paper explored the shortcomings of evaluating UI systems and comes up with a set of criteria to tackle those shortcomings. Due to stable platforms like Windows, the lack of systems skills among a new generation of researchers, and the lack of appropriate criteria for evaluating systems architecture, the author states that there has been a decline in the creation of new architectures for interactive systems. The paper states that UI systems require change because the underlying hardware has gone through a lot of changes and the current generation is more tech-savvy. Some of the values added by UI systems architecture are: reducing development viscosity, enabling scale, and leveraging the power of common infrastructure. The author cautions UI developers not to fall into the usability trap or the fatal flaw fallacy, and not to just go by legacy code. The shortcomings of evaluating UI systems can be tackled by considering the STU model; demonstrating importance (of the user population, target tasks, and set of situations); looking into problems that have not been solved; increasing generality by increasing diversity and the number of demonstrated solutions; reducing solution viscosity; empowering new participants by showing them why they should get involved in the UI process and how the new tools are easier and more effective to use; applying the power of combination by combining a set of primitive design components into more complex designs; and making the interconnections simpler and more straightforward. What I learned from this paper is that when we create systems, we shouldn't just try to meet all the evaluations; we should ask ourselves, "Has important progress been made?" If yes, then happily take our share of the new knowledge and move forward, fixing what went wrong by filling the gaps.

Yuhuan Jiang 0:50:20 10/19/2017

Paper Critiques for 10/19/2017

== Evaluating User Interface Systems Research ==

This paper focuses on the evaluation of user interface systems, specifically how to evaluate whether a new interface is actually an improvement. Some metrics for evaluating complex systems are proposed by this paper. Take flexibility as an example: a UI tool is considered flexible if it is possible to make rapid design changes that can in turn be evaluated by users. The authors also define some claims that a system can make. For example, for a system to claim that it can be composed of smaller basic components, it needs to show that the set of primitives is either relatively small or easily extended. The authors remind readers of the trap of creating new user interface systems merely for the sake of meeting all the evaluation metrics.

== Methodology Matters: Doing Research in the Behavioral and Social Sciences ==

This is a book chapter about key characteristics of research in the behavioral and social sciences. In these sciences, since results depend on methods and all methods have limitations, the results are also limited. These sciences may not have the absolute rigor that mathematical research has, so researchers should realize that it is not possible to maximize all desirable features of their method; tradeoffs and dilemmas are often involved. Also, each set of results in each study must be interpreted in relation to other evidence that bears on the same questions. The chapter discusses various types of measures that are often adopted in the behavioral and social sciences, such as self-reports, observations, trace measures, and archival records. The chapter also discusses how to manipulate variables; some approaches include selection, direct intervention, and induction.

Mehrnoosh Raoufi 1:26:09 10/19/2017

Methodology Matters: Doing Research in the Behavioral and Social Sciences: This paper is mainly about the strategies that should be taken by researchers in the social and behavioral sciences. First of all, it introduces three main domains in research. First is the substantive domain, which concerns the content of research interest; for example, a conversation in a family about buying a new car is content. Second is the conceptual domain, the ideas that form the result of research. The last one is the methodological domain, which covers the methods and techniques of doing the research; for example, a set of procedures for observing family discussions. After explaining each one, the paper focuses on the methodological domain and argues that methods are both opportunities and limitations; that is why methods are considered bounded opportunities. The paper acknowledges that each method has weaknesses. It is not realistic to have a flawless method, but what we can do is use several methods in a manner such that they compensate for each other's weaknesses. The paper categorizes research strategies from three different perspectives: the actor, which refers to human systems; behavior, which refers to all actions of human systems in the region of research interest; and the context, which refers to all the relevant features. Then it introduces desirable criteria in research evidence that should be maximized: generalizability of the evidence over actors, precision of measurement of behaviors, and realism of the context within which the evidence is gathered. At last, the paper talks about four different types of strategies: field strategies, experimental strategies, respondent strategies, and theoretical strategies.

----

Evaluating User Interface Systems Research: This paper mainly discusses how to evaluate new user interface systems architectures. Before it addresses this question, it takes a look at the reasons we need new UI systems architectures. It explains that although existing UIs are comfortable for users, they cannot last forever, because other aspects of systems are changing; in order to keep up with these changes we need to develop new UI system architectures. Then the paper argues about common errors in the area of UI system evaluation that researchers may fall into. These errors are the usability trap, the fatal flaw fallacy, and legacy code. After identifying errors and trying to avoid them, the next step is to identify the claims we are making about our new systems. Each new interactive system fits a particular set of users performing a set of tasks in some situations; we call these three elements STU (situation, task, user). Different claims address these three factors in different ways. The paper talks about claims such as importance, generality, reduced solution viscosity, and power in combination. At last, the author mentions that the goal of evaluation is to answer one fundamental question: "Has important progress been made?"

Ronian Zhang 2:08:38 10/19/2017

Evaluating User Interface Systems Research: This paper talks about ways of evaluating user interfaces in complex systems. The paper starts by addressing the new challenges in hardware and OS (the constraints of hardware, assumptions about users, new input technologies, new platforms) and the challenges in UI system architecture (reducing development viscosity, paths of least resistance, lowering skill barriers, new advanced technology, scalability). Then he points out common evaluation errors: usability assumptions (specialized expertise as a prerequisite, standardized tasks, the scale of the problem), the fatal flaw fallacy (rejecting a system for the omission of important features), and legacy code (another barrier). Then he discusses the evaluation criteria: importance (user population, importance of the tasks to users, frequency with which users are in the situations, the difference the new technology may make); whether it is a new solution to a problem that hasn't been solved before; how general the provided solution is; solution viscosity (flexibility, reduction of choices, distance of the expressed design from the problem solution); whether it can lower the threshold for new participants; the effectiveness of combining basic building blocks (mechanisms for combining primitive components into complex designs, reducing the cost of producing new components, the ease of combination); and scalability. I think the paper is inspiring: to make better UIs, knowing how to evaluate whether a UI is good is the first step and acts as the foundation; only in this way can we judge emerging UIs and better apply the technology.

----

Methodology Matters: Doing Research in the Behavioral and Social Sciences: The paper talks about three basic facets of the research process: content (the substantive domain, the subject of study), ideas (the conceptual domain, the properties of the states and actions of the focus, where the relations are how elements are connected), and techniques/procedures (the methodological domain, the various methodologies for measuring features, where the relations are the application of various methods to valuing different features). The author also addresses several issues. The dilemma of research methods: they enable and also limit knowledge; methods have both weaknesses and merits when applied; using multiple methods can offset the weaknesses, but in doing so the strengths of some may also be offset by the weaknesses of others. The research strategies: field strategies (make direct observations while limiting intrusion into the systems), experimental strategies (gaining precision while sacrificing the realism of the context), respondent strategies (high on precision, low on generalizability), and theoretical strategies. Comparison techniques: base rates (not knowing the frequency leads to less knowledge about the possibilities), correlations (whether two variables have a dependent relationship), and differences (comparison). Validity: internal (how sure we are that A influences B), construct (how well the constructs are defined), external (whether the relationship still holds under replication), and the threats to each. Classes of measures and their strengths and weaknesses: self-reports (low dross rates, but may not reflect the respondent's real thoughts), observations (vulnerable to observer errors), archival records (analyzing existing archives; unobtrusive, but not very versatile and loosely linked to the measured problem), and trace measures (low versatility, high dross rates, loose linkage). Techniques for manipulating variables: selection (convenient, but may limit the data set), direct intervention (easy to get a definite manipulation, but limited to tangible variables), and induction (specific, but may have validity problems).

Akhil Yendluri 2:15:53 10/19/2017

Methodology Matters: Doing Research in the Behavioral and Social Sciences
This paper gives the basic idea of how to do research and the various factors involved in doing it. The paper starts with a basic description of the features involved in research and explains the three domains of research: the substantive domain, the conceptual domain, and the methodological domain. It further describes the various techniques involved in doing research, such as the sample survey, experimental simulation, laboratory experiment, etc. The author explains the advantages and disadvantages of each method before concluding that research should be performed with multiple methods in order to complement the strengths and weaknesses of each; this builds up confidence in one's research. The author then explains that randomization is the random assignment of cases to conditions, and explains why random allocation doesn't necessarily guarantee an equal distribution of extraneous factors. Validity is central to the research process, and the author explains the various ways of assessing it: internal validity, construct validity, external validity, and threats to validity. This chapter has helped me understand how to properly conduct research and the various methods involved in it.
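A small simulation of the randomization point above, using hypothetical participant data: cases are randomly assigned to two conditions, yet a small sample can still end up unbalanced on an extraneous factor such as prior expertise:

    import random

    random.seed(1)
    participants = [{"id": i, "expert": random.random() < 0.5} for i in range(12)]

    random.shuffle(participants)                 # random allocation
    half = len(participants) // 2
    condition_a, condition_b = participants[:half], participants[half:]

    experts_a = sum(p["expert"] for p in condition_a)
    experts_b = sum(p["expert"] for p in condition_b)
    print(f"experts in A: {experts_a}, experts in B: {experts_b}")  # often unequal at n=12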
Evaluating User Interface Systems Research
This paper discusses the various challenges involved in the development of user interface systems. The author starts with the history of user interfaces and how there was extensive research from the early 70's till the 90's. The author states that with the rise of Microsoft and Apple and the maturing of user interface input tools, research on this topic started to fade. People didn't have the required systems-building skills, and current-day computers have hardware capabilities far more advanced than what we make use of, which was not the case earlier. The author gives an in-depth explanation of how to evaluate the effectiveness of systems and tools. It is important that a user interface toolkit is flexible and expressive enough to adapt to changes and to support complex designs. It must have capabilities to simplify complex tasks and help bring new users into the field. It must also be scalable enough to adapt to future changes.

Ruochen Liu 2:55:14 10/19/2017

1. Methodology Matters: Doing Research in the Behavioral and Social Sciences: Doing research in the behavioral and social sciences has always been crucial to the human-machine interaction area. This book chapter, written by Joseph E. McGrath, mainly talks about the methodology and tools of research in the behavioral and social sciences. First, the basics of research are mentioned: (1) some content that is of interest, (2) some ideas that give meaning to that content, and (3) some techniques or procedures by means of which those ideas and contents can be studied. More specifically, these are the substantive domain, the conceptual domain, and the methodological domain. The elements of these three domains are phenomena, properties, and modes of treatment, and the relations are patterns, relations, and comparison techniques. Personally, I learned some research strategies from this chapter, such as the fundamental dilemma of the research process: the generalizability of the evidence, the precision of measurement, and the realism of the situation cannot be maximized simultaneously. The only thing one can do is increase one of these three features and reduce one or both of the other two. The designs, methods, and strategies constitute a powerful technology for acquiring information about phenomena and relations.

2. Evaluating User Interface Systems Research: This paper is a research report about evaluation methods for future user interface systems. First, the author presents the current situation: the future development opportunities for interface systems will come from new systems that are off-the-desktop, nomadic, or physical in nature, while current simple usability testing cannot fulfill the requirements for evaluating such complex systems. Then the author presents many alternative standards focusing on the comparison and evaluation of future complex systems. Avoiding the usability trap is another thing the author stresses. Usability experiments are shown to be built on three key assumptions: walk-up-and-use, the standardized task assumption, and the scale of the problem. But those assumptions are rarely met by toolkits and UI architectures. In the author's opinion, the evaluation should focus on the improvement of the system itself, which is the key evaluation strategy, rather than relying too much on traditional usability experiments.

Amanda Crawford 6:34:08 10/19/2017

Methodology Matters: Doing Research in the Behavioral and Social Sciences, Joseph E. McGrath, in Readings in Human-Computer Interaction: Toward the Year 2000, pp. 152-169. McGrath's chapter on doing research in the behavioral and social sciences addresses the questions: What is research? How does one conduct proper research? His clear definition of research points out how the systematic use of theoretical and empirical tools forms the main component of the research process. He believes that by carefully scoping out the tools that one uses for research, a person is able to validate or invalidate the results of their experiment. McGrath explores the research process from the substantive, conceptual, and methodological perspectives. His main focus, however, is on the methodological domain. This chapter gives us descriptive definitions of research strategies, comparison techniques, randomization, validation, and manipulation techniques. It gives a great representation of the research process and constructively guides us on how to think like a researcher. McGrath not only describes how to use the tools but, within each feature, walks us through its limitations, followed by prescriptive measures.

**********

Evaluating User Interface Systems Research, Olsen, D.R., Proc. of UIST 2007, pp. 251-258. This paper gives us some insight into how to conduct evaluative research in the human-computer interaction domain. Much like McGrath, Olsen discusses the faults in improper use of usability, fatal-flaw, and legacy-code evaluation methods. He instructs us on how to evaluate the claims of effectiveness of a UI system. Surprisingly, the evaluation methods proposed are closely aligned with McGrath's classification of generalizability methods. McGrath stresses that if one chooses to maximize generalizability, then one minimizes realism and precision. We can see that this partially holds, as Olsen argues that solutions with greater diversity of use are more effective than those that are limited. Although this claim may be important, it imposes constraints on other measures, such as creating systems for more advanced users, which may require research evaluation methods that embrace precision. Overall, this paper is a good way of examining a UI system as a research scientist in order to find further improvements.

Kadie Clancy 8:30:20 10/19/2017

Evaluating User Interface Systems Research: Technologies, including user interface technologies, progress based on the ability to evaluate new methods and ensure that progress is being made. But simple usability testing is not adequate for evaluating complex interactive systems, as they resist simple controlled experiments. The article presents several standards by which these complex systems can alternatively be compared. All new technology addresses a set of users performing tasks in certain situations; this STU context forms a framework for evaluating system quality. A complex system can be evaluated in the following ways: importance, ensuring that the problem has not been previously solved, generality, reduction of solution viscosity, flexibility, and expressive match. This paper is important as it presents new and appropriate ways of evaluating complex systems. If all systems were designed to yield to controlled experimentation, no real progress in the field of user interface systems would be made, since the system complexity would likely confound these controlled experiments. I also think it is important that the author warns against requiring a new system to meet all the evaluation standards described in the paper. As long as genuine progress is being made, some evaluation metric should be able to convey that fact.

Methodology Matters: Doing Research in the Behavioral and Social Sciences: This chapter discusses the tools with which behavioral and social science researchers conduct research. The basic features of the research process consist of the substantive domain, the conceptual domain, and the methodological domain. The substantive domain refers to phenomena, in this context the states and actions of human systems. In the conceptual domain, the objects of interest are familiar concepts like "attitude" or "power", the properties that are the focus of the study. In the methodological domain, the elements are methods. The author emphasizes that results all depend on these methods, and all methods have limitations, so any set of results will be limited. This chapter also discusses study design, comparison techniques, forms of validity, and techniques for manipulating and controlling variables. The chapter is important as it provides detail on how to conduct research in the behavioral and social science domains, which is highly applicable to the field of HCI and can guide research in the field. I also think it is important that the author emphasizes that any body of evidence is to be interpreted based on the strengths and weaknesses of the method and conceptual choices it employed, and that evidence is reliant on all of those methodological choices and constraints.