Sunday, October 2, 2011

A Tutorial Supervisor for automatic assessment in educational systems.

In the face of growing economic pressures, Leeds Metropolitan University is seeking to use educational technology and computer-based learning to solve the practical problems of providing for increasing student numbers given static or decreasing resources, a widespread problem shared with many other educational institutions. Of particular interest is the educational use of hypermedia. This is seen as being potentially valuable in that there is evidence to suggest that learning is improved when a student is allowed to follow pathways of their own choice, at their own pace, and able to monitor their progress by instant feedback questions (Theng & Thimbleby, 1998).

**********

A major concern, however, with the educational use of hypermedia is that of user navigation. The issue is how, on the one hand, to prevent the user from becoming overwhelmed with information and losing track of where they are going, while on the other hand permitting them to make the most of the facilities the hypermedia offers. One approach to remedy this is to restrict the number of links made available to the student. The concern that this might lead to an impoverished set of learning opportunities can be countered by the use of adaptive hypermedia, which seeks dynamically to configure the available links as the student proceeds.

THE EDUCATIONAL IMPORTANCE OF ADAPTIVITY

Our argument for the importance of adaptivity in an educational hypermedia environment is based, in part, on the claim of Elsom-Cook (1989) that the perfect tutoring system should be able to slide between the two extremes of total constraint and total absence of constraint, according to the student's needs and current state of knowledge. With this in mind, one of our aims is to facilitate hypermedia systems with the ability to adapt to their students' needs as they progress through the system. To this end we have developed a prototype hypermedia shell, called Hypernet, which provides a combination of knowledge-based representations (which we refer to as "semantic
hypermedia") and neural networks ("connectionist modelling") to provide a structured environment, which is suitable for producing automatic links through simple reasoning mechanisms (Mullier, 1999) and to provide the student with a structure that aids them in their navigation (Theng & Thimbleby, 1998).

We argue that this approach overcomes two major concerns in the domain of educational hypermedia. One is that since a hypermedia learning system shifts the responsibility for accessing and sequencing information from the teacher to the student, this may entail a cognitive overload: "The number of learning options available to learners places increased cognitive demands upon the learners that they are often unable to fulfil" (Jonassen & Grabinger, 1990). A particularly important manifestation of this cognitive overload occurs when a user becomes "lost in hyperspace," not finding or being presented with the required information, a problem caused by the complexity associated with "having to know where you are in the network and how to get to some other place that you know (or think) exists in the network" (Conklin, 1987). Our prototype seeks to deal with this problem essentially by making more links available to students as their knowledge of the domain is judged to be improving, in an attempt to provide the sliding scale approach advocated by Elsom-Cook (1989). The novice student of our system is therefore freed from much of the complexity associated with an unfamiliar system teaching an unfamiliar subject, while the more advanced student is freed from unhelpful constraints.

To decide how many links to offer a student, it is necessary to grade the student into an ability level. It is a requirement that this be achieved in an automatic way for the system to be standalone. It is for this reason that our system includes a sub-system called the Tutorial Supervisor (TS) whose role it is to gauge the student's ability in response to the tutorials.
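As a purely hypothetical illustration of how an ability level produced by the TS might drive the number of links offered to a student (the article does not specify Hypernet's actual link-selection policy), a linear sliding scale between total constraint and no constraint could be sketched as:

```python
def links_to_offer(ability_level, available_links, max_level=9):
    """Hypothetical sliding scale: the higher the TS grades the student,
    the larger the fraction of the available links that is exposed.
    Illustrative only; not Hypernet's actual link-selection policy."""
    fraction = (ability_level + 1) / (max_level + 1)
    n = max(1, round(len(available_links) * fraction))
    return available_links[:n]

links = [f"link{i}" for i in range(20)]
novice_view = links_to_offer(0, links)   # a novice sees only a few links
expert_view = links_to_offer(9, links)   # an expert sees all of them
```

Any monotonic mapping from level to link count would serve the same purpose; the linear form is simply the most direct reading of the "sliding scale" idea.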
It is anticipated that the TS system could be used outside of the Hypernet system in other, perhaps non-hypermedia based, educational systems. The TS has the potential to be useful in situations where questions and tutorials can be automatically marked, such as multiple choice questions or keyword matching. The remainder of this article will discuss the design and implementation of our TS system and several advantages, such as our TS's ability to regrade misgraded questions and to adapt to domains of different questions. Hypernet also identifies browsing patterns that the student makes as they use the hypermedia and combines this information with ability level information from the TS. This is done to attach meaningful information to the browsing patterns. For example, if students who have high ability scores, as indicated by the TS, tend to use similar browsing patterns, then this information is recorded and can be used to encourage beneficial browsing in future students or to infer an ability from browsing pattern alone. This is discussed and experimental evidence is presented in Mullier (1999) and Mullier, Hobbs,
and Moore (2002).

RATIONALE FOR USING A NEURAL NETWORK

A neural network has been chosen to form the TS in preference to a symbolic rule-based system chiefly because, unlike a rule-based system, it is domain independent. It is unlikely that, for example, a high level student would produce results in the same range for every type of domain. Thus the rule "IF SCORE > 70 THEN LEVEL 10" is only likely to apply to the domain that it was initially defined for. This is the reason why Bergeron, Morse, and Greenes (1989) use a neural network for their TS. Their neural network holds the rules that it learned from its training data (the first domain). It was then able to change its rules in response to new data (new domains) by retraining offline. In this manner the neural network can adapt to misconceptions or inaccuracies in the original rules and adapt to new situations.
This would be a difficult and time-consuming process for a symbolic rule-based system, since it would require the re-engineering of the rule-base by a knowledge engineer: the new rules would have to be identified and then encoded. In essence, the neural network is doing the job of the human rule designer. Designing such rules is not necessarily a simple matter, since it requires the human designer to examine many student interactions with various questions and tasks so that a valid grading of each question can be made (e.g., this question was answered well by novice students, it is therefore easy and can be presented to other novice students). The situation is further complicated by the possibility that different populations of students (students from different classes or tutorial groups) may have different previous knowledge of the domains, and therefore the initial question gradings may not apply to them. It would therefore be helpful if an automated system could be employed to accomplish the task of dynamic question grading and thereby remove some burden from the author. However, Bergeron et al.'s (1989) tutorial supervisor is unable to readapt to different domains without being manually provided with new training data and then retrained offline.
The remainder of this article describes the neural networks used for the TS in our research system, called Hypernet (Mullier et al., 2002; Mullier, 1999), which improves upon Bergeron et al.'s (1989) design by allowing automatic online adaptation to different domains. Our TS improves on Bergeron's original specification by allowing the neural network to adapt to both students and questions/tutorials while the system is in use.

TUTORIAL SUPERVISOR ARCHITECTURE

The multilayer feedforward neural network (MLFF), sometimes referred to as the Multilayer Perceptron or back-propagation network, was initially considered, since it has been mathematically proven that a MLFF is capable of performing any mapping function (Fine, 1999). It is therefore generally accepted that the standard MLFF architecture is the first architecture to consider. However, the Kohonen (or self-organising feature map) neural network offers an additional facility that was considered useful in this case, namely the ability to continually
readapt without human intervention, called unsupervised learning (which is not the case for the MLFF). This is useful since it allows questions to be regraded as students use the system. It should be noted, however, that the Kohonen neural network is a mathematically weaker neural network than the MLFF. In practice this means that it takes longer to train and it is less likely to perform the correct mapping when the input data is noisy (Hagan, Demuth, & Beale, 1996). It is therefore important to establish whether the input data is noisy and that the Kohonen neural network does produce an acceptable solution in an acceptable period of time. This was the subject of the experiments described later in this article.

The Kohonen neural
network's ability to learn without human interaction is useful, since the distinction between a novice student and an expert student, in terms of marks at tutorials, may be small, or may vary significantly from domain to domain. For example, the majority of students may achieve marks between 50% and 60%, with a few results between 60% and 75% and a few between 40% and 50%. There are, therefore, two large ranges of numbers that occur infrequently (0-40 and 75-100). If a MLFF neural network were to be used to model the previously described problem, then these mark ranges must be identified beforehand, or the neural network's outputs would have to be designed to produce student ability levels between 0 and 100, to accommodate a generic range of scores. Identifying scores beforehand is not likely to be practical, since it would require the collection of a large amount of data (with no initial benefit for the student). Designing a generic neural network also introduces the following difficulties. The MLFF neural network must have enough ability levels (outputs) to clearly demonstrate the distinction between students in these highly clustered areas, necessitating an increase in outputs for all areas to cover all possibilities, even those that are unlikely. The increase in outputs renders the neural network more complex, resulting in a network that is more difficult to train. Furthermore, and most crucially, once the trained MLFF neural network is used for different domains, there is no direct correspondence between an ability level for one domain and an ability level for another. This is because an output of the MLFF does not correspond directly to a student ability level, since the student ability level may vary between domains. This can be seen in Figure 1.
In Figure 1, Domain A (MLFF) and Domain B (MLFF) activate different outputs, since domain A and domain B have different ranges of scores. This results in different outputs of the MLFF neural network becoming active to represent each ability level. Since the outputs of the neural network are connected to a system which is used to determine how many links to offer to a student, it becomes necessary to manually define which active output should be associated with each student ability level. By contrast, the Kohonen neural network automatically adjusts its outputs to match the ranges of student marks presented to it, since it uses unsupervised training. It therefore does not require any manual intervention and can be designed with fewer outputs, since each output directly corresponds to a student ability level. This advantage is considerable and warrants further investigation of the Kohonen architecture, irrespective of the literature's claims that it is a weaker classifier than the MLFF.

A Kohonen network can solve the problem by continually adapting to input stimuli while it is being used by students. This is because of the way a Kohonen network operates. A Kohonen network is given a number of outputs by the network designer, representing the number of categories that the network designer wishes the network to identify (the number of required student levels). It is left to the network itself to sort the input data into this number of categories, since it is not implied by the training data itself, as would be the case for a MLFF, in that a MLFF requires an example solution with its training data. If there are 10 or more distinct patterns in the data then a correctly trained Kohonen will learn by itself to distinguish them (Kohonen, 1989; Fine, 1999).
Note that this means that the Kohonen neural network grades a student population according to all the populations it has encountered, since it is always learning, whereas a MLFF neural network always grades a population according to the population it was originally trained with. This situation may not be desirable in certain circumstances. For example, the system may be used by final year students and then by first year students. It is expected that the final year students will be of a higher level than the first year students. However, since these populations of students use Hypernet separately, the TS will attempt to adapt to them separately. If, for example, the final year students use Hypernet before the first year students, then the TS will adapt to the final year students and be unable to grade the first year students, since they will all be grouped in the low levels. In practice, this situation is not likely to arise because the network requires hundreds of interactions to adapt fully. However, the system can automatically gather student data and retrain periodically. Enabling and disabling the Kohonen neural network's ability to adapt can be accomplished by disabling the learning part of the Kohonen algorithm.
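As a concrete illustration of this behaviour, the following is a minimal one-dimensional Kohonen (self-organising map) sketch in Python, with a flag for disabling the learning step. It is our own simplified illustration under assumed parameters (learning rate, neighbourhood of one unit either side), not Hypernet's actual implementation.

```python
import random

class KohonenTS:
    """Minimal 1-D Kohonen sketch: each output unit comes to represent
    one student ability level. Illustrative only."""

    def __init__(self, n_inputs, n_levels, lr=0.2, seed=0):
        rng = random.Random(seed)
        # One weight vector per output (ability level).
        self.weights = [[rng.random() for _ in range(n_inputs)]
                        for _ in range(n_levels)]
        self.lr = lr
        self.learning_enabled = True  # clearing this disables adaptation

    def level(self, scores):
        """Return the winning output (ability level) for a vector of
        recent tutorial scores scaled to 0..1."""
        def dist(w):
            return sum((wi - si) ** 2 for wi, si in zip(w, scores))
        winner = min(range(len(self.weights)),
                     key=lambda i: dist(self.weights[i]))
        if self.learning_enabled:
            # Move the winner and its neighbours towards the input, so
            # the map keeps readapting while students use the system.
            for i in range(max(0, winner - 1),
                           min(len(self.weights), winner + 2)):
                self.weights[i] = [wi + self.lr * (si - wi)
                                   for wi, si in zip(self.weights[i], scores)]
        return winner

ts = KohonenTS(n_inputs=6, n_levels=10)
lvl = ts.level([0.43, 0.46, 0.46, 0.46, 0.46, 0.46])  # six time-shifted scores
ts.learning_enabled = False  # freeze the map, e.g. between scheduled retrains
```

Setting `learning_enabled = False` is the "disabling the learning part of the Kohonen algorithm" mentioned above: the network still classifies, but its weights no longer move.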
TRAINING DATA

The inputs to the Kohonen neural network must incorporate history data to make a more informed evaluation of the student and thereby avoid a restriction of Bergeron et al.'s (1989) neural network, namely reacting to a one-off error (or success) from a student. History data can be used to prevent the TS from making snap judgements about the student. For example, if the student is generally performing well but gets one question wrong, then if no history data is taken into account the TS is forced to make a decision based only upon the most recent presentation, and the student is likely to drop a level. The student ability itself represents a degree of history data, in that if a student is regarded as an expert student, they must have performed well in the past. However, the direct incorporation of history data prevents a continual changing of levels based upon one interaction only. The incorporation of history data can be achieved by presenting a number of previous interactions with tutorial nodes to the neural network. Each time a new interaction is presented, the previous interactions are shifted along the inputs to accommodate the new input and the oldest interaction is lost. Training data supplied to the neural network are figures that represent a percentage value of a student's interaction with a tutorial. For example, if the student achieved a 50% success level with a tutorial question, then it is this figure that is passed to the TS.

A program was devised to generate training data for the neural networks. The program generates simulated student scores based upon several parameters described below. The parameters allow the neural network designer to cluster actual student levels into groups. This allows the testing of the hypothesis that the Kohonen variant
can adapt to different clusterings of student results, and hence to different domains, since a network can be trained with data clustered in one way and then tested to ascertain whether it can adapt to data clustered in another way. Each line of input data presented represents the time-shifted data. The number of items on each line is variable, to allow the design of neural networks with varying numbers of time series inputs, in order to determine experimentally which produce acceptable results.

The input parameters to the Training Data Generator are the number of time-shifted inputs, the number of outputs (student levels), and the number of test cases to be generated. Further inputs allow the tailoring of the actual data, including the minimum and maximum values to produce (to represent ranges of scores; for example, most scores may be limited to between 30 and 70%) and a deviation from the current value. The deviation limits the next value to be within the deviation from the current value. The data may be split into several virtual students by the student value; this simply generates a new value that is not dependent upon the deviation, to simulate a different student using the system. Further details of the training data generator program can be found in Mullier (1999). An example output of the Tutorial Supervisor Training Program is:

43 46 46 46 46 46 5 new student
39 43 46 46 46 46 4
40 39 43 46 46 46 5

The first six values in any row represent the time-series input to the TS.
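Rows like those above can be produced by a small simulator. The following Python sketch follows the description in the text (a trend that moves within a deviation, random outliers, "new student" resets, and a guide level derived from the mean of the inputs); the parameter names and the exact guide-level formula are our own assumptions, not the original program's.

```python
import random

def generate_rows(n_rows, n_inputs=6, lo=30, hi=70, deviation=4,
                  p_outlier=0.05, p_new_student=0.1, n_levels=10, seed=2):
    """Sketch of the training data generator: each row is the
    time-shifted window of simulated scores plus a guide level."""
    rng = random.Random(seed)
    trend = rng.randint(lo, hi)
    window = [trend] * n_inputs
    rows = []
    for _ in range(n_rows):
        tag = ""
        if rng.random() < p_new_student:
            trend = rng.randint(lo, hi)   # reset: ignore the previous trend
            score = trend
            tag = "new student"
        elif rng.random() < p_outlier:
            score = rng.randint(lo, hi)   # one-off outlier; trend unaffected
        else:
            trend = min(hi, max(lo, trend + rng.randint(-deviation, deviation)))
            score = trend
        window = [score] + window[:-1]    # shift the time-series inputs along
        # Assumed mapping from mean score to a guide level:
        guide = round(sum(window) / len(window) * n_levels / 100)
        rows.append((window[:], guide, tag))
    return rows

for inputs, guide, tag in generate_rows(3):
    print(*inputs, guide, tag)
```

Because each row is the previous window shifted by one, consecutive rows overlap in five of their six values, exactly as in the example output above.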
The final value represents a guide student level based upon the mean value of all inputs; this figure may be used by the researcher to make a quick evaluation of the line of data. The "new student" marker indicates that the general trend in scores has been randomly reset to represent another student using the system. For example, when a score is generated for a student, this score is between the minimum and maximum scores. The next score is then calculated as a random deviation value plus or minus the previous score, unless an outlier score is randomly signalled, in which case a completely random value is generated. After an outlier score has been generated, the next score is again dependent upon the previous trend. This continues until a "new student" is generated, in which case a random value is generated, ignoring the previous trend (like an outlier), and all subsequent scores (with the exception of outliers) are based upon this score.

THE KOHONEN TUTORIAL SUPERVISOR EXPERIMENT DESIGN

The experimental Kohonen neural networks were defined in line with a strategy devised by Masters (1993). It is not usually possible to exhaustively test every possible combination of neural network parameters. Instead, it is advisable to:

* Define the number of inputs, as driven by the problem itself.

* Define the number of outputs, as driven by the problem.

* Use a standard number of epochs (number of training examples presented to the network), and increase only if a network does not converge
on a solution.

* Use a standard training and test set. Change only if required specifically by the problem.

Specifically for this problem, the following was undertaken to determine the best configuration. The number of inputs was steadily increased to examine whether or not a successful network could be defined for instances where a large number of history cases are required. The number of outputs was increased to test for cases where a large number of student levels are required.

The number of time-shifted inputs is crucial to the grading of students using a Kohonen network, since the network is not provided with an expected level for the student; it must work this out for itself. The inputs have equal priority; that is, the network does not explicitly know that the first input, being more recent, is more important than the last. Therefore, data presented to the neural network, both real and simulated, provides a temporal pattern, in that a value appears at one input and then travels along the inputs to the last input, after which it disappears. For the neural network to be able to take this into account, two methods were employed for extracting training data from the simulated data. Original order extraction presents training material to the neural network in the original sequential order, so the time-series element is preserved. The second method is to randomly extract the training data, so each training item appears in isolation and is not related to the training item before or after it.

A Kohonen network is given a number of outputs that represent the number of categories into which it must separate its input data.
It is possible to determine that a Kohonen network has been trained to a sufficient level if it has activated all its outputs with the training data to the same degree that the patterns occur in the training data. This may be represented by a pie chart as the Kohonen network is training, each segment of the pie chart being an output level for one of the outputs, making it possible to see how much each output has been activated and whether all outputs have been activated.

EXPERIMENTAL METHOD

The following experimental parameters were varied:

The number of inputs: It is necessary to determine how many inputs were required to give a balanced level for the student, while maintaining as simple a neural network as possible. This is not a straightforward matter, since the more inputs the neural network has, the more history data is presented to it, which will directly affect its output. It is sensible to identify a range for the number of inputs that produce successful neural networks. This range of neural networks may then be employed in trials with real student data to determine which is/are the best. These trials were beyond the scope of the current project.

The number of outputs: The number of outputs represents the number of student levels that the TS can grade a student into. The number of outputs also affects the complexity of the neural network (more complex networks, i.e., with more outputs, are less likely to converge on an acceptable solution and/or require more training).
This parameter is therefore vital, since it is driven not only by the necessity for a simple network design, but by the necessities of the TS itself (how many levels it can offer to the system as a whole). Ten student levels were defined as a reasonable number on the grounds that this provides enough distinction between a novice and an expert student, although the experiments were designed to identify a range (by changing the number of outputs) should different amounts be needed.

The number of training epochs: This is how many times data items are presented to the neural network during training. For complex networks (with a large number of input and output neurones) this is vital, since a complex neural network may memorise its training data and lose its ability to generalise to other data, a phenomenon known as "over-fitting" (Masters, 1993). For Kohonen neural networks, it is possible to determine when all outputs have become activated to the desired degree and then stop the training. This can be achieved by examining the output levels, for example via the pie chart, and determining whether the network is producing values similar to the known (because it was generated) composition of the data.

The training set extraction method: Two types of extraction method were employed, random extraction and original order extraction.
Since the data provided to the neural network represents time-series data, it was considered important to present the data in its original order, so that the time element is not removed between examples. To test the significance of this, a random extraction method was used as a control.

The domain data type (limiting values to certain areas, denoted by "Range Train" and "Range Test" in the results tables): The neural networks were trained with various data clusters to determine to what degree they were able to differentiate between them. A neural network trained with data clustered in one area would then be tested with data clustered in another area to determine whether it was able to adapt to the new data. For example, a network might be trained with data clustered around 30-75% and then retrained with data clustered in another region, 40-90%. The "Range Train" column in the results tables represents the range of scores the neural network was trained with, and the "Range Test" column the range of data it was tested with once trained. Different ranges in Range Train and Range Test can be used to determine whether the neural network can adapt to different domains where the range of scores differs.

EXPERIMENT RESULTS

To determine whether the neural networks were successful, it was necessary to train them with data of a known composition. For example, if 10 categories of data of equal size are presented to a neural network, then it should activate 10 outputs to roughly the same degree. This was possible since the data was generated.

For Kohonen unsupervised learning architectures it is not possible to assign a definite success value, since the Kohonen architecture is not trained by supervised learning: it is not provided with a desired result and is not told whether it is right or wrong. This is not to say that in all cases a Kohonen neural network's results cannot be graded against results obtained by another means. If results could be obtained by another means (i.e., a training set and test set can be constructed with results data, but the trained neural network is then used on its own), then a similar comparison can be made. In the case of the TS, it is not possible to generate accurate student levels with the generated input data. This is because of special cases, such as the inclusion of history data that may contain outliers, which render the classification complex. The only way to judge the performance of the Kohonen network in this case is to train it with training data of a known composition, in terms of a statistical guide to the ratio of patterns within it, and to determine that the outputs are activated to a similar degree. This method was used for the experiments described in the following sections.

SUCCESS CRITERIA

The results tables show the configurations of Kohonen neural networks that did and did not produce a successful result. Determining a successful result was based on an examination of the numerical data produced by the neural network against the previously generated data, which was of a known composition. Success is denoted in the results tables by a "Yes" in the success column for a successful neural network and a "No" otherwise.
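This success test can be sketched in a few lines (the function names and tolerance are illustrative assumptions): count how often each output wins during training and compare those proportions against the known composition of the generated data.

```python
def activation_proportions(winners, n_outputs):
    """Fraction of training presentations 'won' by each output unit."""
    counts = [0] * n_outputs
    for w in winners:
        counts[w] += 1
    return [c / len(winners) for c in counts]

def matches_composition(proportions, known, tolerance=0.05):
    """Success test: each output is activated to roughly the same degree
    as its category occurs in the generated training data."""
    return all(abs(p - k) <= tolerance for p, k in zip(proportions, known))

# generated data known to comprise 25% / 35% / 40% of three levels
winners = [0] * 25 + [1] * 35 + [2] * 40
props = activation_proportions(winners, 3)
```

These proportions are exactly what each segment of the training pie chart displays, so the visual check and this numerical check agree.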
The Kohonen architecture produces real-number values; in most cases the output with the highest value is designated the "winner" and all other values are ignored (for example, the output representing level 6 produced the highest output activation, therefore the student is graded as level 6). This approach alone is not suitable for determining the success criteria of our TS, since an output may be designated the winner with a weak output (a low real value) simply because all the other outputs are weaker. A network that produces a weak output is less likely to produce consistent results given similar data presentations (Skapura, 1996). The relative sizes of the real values at each output can be interpreted as certainty factors, and it is reasonable to select networks that are "certain" of their results (Masters, 1993). Networks failing to produce high levels of certainty can be attributed to unclear input data (which would affect all networks) or insufficient training times (epochs). Once a network was defined as being certain, in that it was outputting values similar to the known composition of the data, its actual outputs were examined to determine whether it was making a reasonable distinction. This was achieved first by ensuring that the outputs were active to a degree similar to the amount of each category in the input data. For example, since the composition of the training data is known (e.g., 25% level 6, 35% level 5, etc.), there should be a direct correlation with the relative outputs of the network.
Second, exception data (for example, the student performed poorly on one input when the rest were generally good) was then tested with the networks to determine whether they were making a reasonable distinction. This was taken to be a movement of one level for each item of exception data, which fits well with the pedagogy proposed by Bergeron et al. (1989).

The results tables show the various configurations of neural network that were applied to the problem of generating an efficient TS. A discussion of the tables follows.

DISCUSSION OF RESULTS

* Experiments 1, 2, and 3, with five inputs, produced networks that identified a solution with between 5 and 30 outputs. Experiment 4 continued the trend of increasing the number of outputs; however, this configuration produced a weak neural network. Experiment 11, which has the same number of outputs as experiment 4, does produce a successful neural network. The difference between experiments 4 and 11 is that the number of inputs is increased from 5 to 10 in experiment 11. The likely reason why increasing the number of inputs produces a successful network is that enough information is then presented to the network for it to make a distinction into the larger number of student levels. A potential difficulty, however, is that the range of data presented to the network might in reality not represent such a range of student levels. In this case the neural network is likely to be making other determinations to generate the range of outputs, determinations that are not consistent with the problem in hand. For example, the neural network may begin to identify certain exceptions to the current trend as levels in themselves. The problem of a neural network making unwanted or unforeseen determinations is a common one. A classic example, often used to highlight the problem, was a system for identifying enemy tanks.
The system worked well on the test data (data from the same population as the training data but not used in training), but failed totally with new data. It transpired that the training pictures of tanks were taken on a sunny day and the training pictures without tanks on a cloudy day, and the network had learned to distinguish the weather (Skapura, 1996). The tank system suffered a training data problem, in that the problem would not have arisen had the training data included pictures of tanks on both cloudy and sunny days. It does, however, highlight the problem with connectionist systems that the designer is not in direct control of the task at hand.

* Experiments 4, 5, and 12 reinforce the conclusion that a large discrepancy between the number of inputs and the number of outputs produces networks that do not give acceptable results. Experiment 12 demonstrates that this is not dependent upon the number of inputs (since 10 are used) and experiment 5 demonstrates that training the network for longer makes no difference.

* Experiments 6 and 7 demonstrate that a network is capable of learning to produce the desired number of output levels for a limited data range and is able to adapt thereafter to data in other ranges. This is a crucial point, in that it strongly indicates that the neural network will be able to adapt to different domains without manual intervention. Experiment 6 shows that the network can still identify the student levels when student scores are limited to a smaller range. Further, experiment 7 shows that once a network has been trained with one range it can still adapt to another.

* Experiments 8, 9, 10, and 11 demonstrate that larger networks with 10 inputs work well.
These experiments were mainly used to test the hypothesis that a large imbalance between the number of inputs and outputs produces untrainable networks (described previously). It is doubtful, however, whether this large number of inputs would be required for a practical tutorial supervisor; this is considered further in the discussion section of this chapter.

* Experiments 12, 13, and 14 all produce unacceptable networks, because there are too many levels for the network to produce discrete outputs. Since the data presented to the network falls in the range 0-100, it is not reasonable to expect the network to classify categories that comprise only two or three numbers.

* Experiments 15 to 20, together with experiment 5, demonstrate that the optimum number of training epochs is 100: with too few, the networks do not converge (16, 18, 20). Experiments 17 and 18 demonstrate that a network may learn to classify a data range in a low number of epochs but will not quickly adapt to a new data range. The experiments indicate that a network will adapt to a new data range within 100 epochs, that is, after one or more students have completed 100 questions or tutorials. Experiments not documented here showed that using more than 100 epochs did not affect training, for this problem at least. In practice it could be seen, by examining the network outputs during training, that the networks had converged upon a solution before the 100th epoch and that further training did not change them.
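The convergence test described above can be sketched as a minimal winner-take-all training loop (a simplification: the neighbourhood function of a full Kohonen map is omitted, and the function name, learning-rate schedule, and tolerance are assumptions for illustration). Training halts as soon as an epoch no longer moves the weight vectors appreciably, which is how convergence before the 100th epoch shows up.

```python
import random

def train_competitive(patterns, n_outputs, max_epochs=100, lr0=0.5, tol=1e-4, seed=0):
    """Winner-take-all competitive training with an early-convergence test."""
    rng = random.Random(seed)
    dim = len(patterns[0])
    weights = [[rng.uniform(0.0, 100.0) for _ in range(dim)] for _ in range(n_outputs)]
    epochs_run = 0
    for epoch in range(max_epochs):
        lr = lr0 * (1.0 - epoch / max_epochs)       # decaying learning rate
        moved = 0.0
        for p in patterns:
            # winner: output whose weight vector is closest to the pattern
            winner = min(range(n_outputs),
                         key=lambda o: sum((w - x) ** 2
                                           for w, x in zip(weights[o], p)))
            for j in range(dim):                    # move the winner toward the pattern
                delta = lr * (p[j] - weights[winner][j])
                weights[winner][j] += delta
                moved += abs(delta)
        epochs_run = epoch + 1
        if moved / len(patterns) < tol:             # converged before max_epochs
            break
    return weights, epochs_run

weights, epochs_run = train_competitive([[50.0, 50.0]] * 20, 2)
```

On trivially easy data the loop stops well before the epoch limit, mirroring the observation that further training beyond convergence did not change the networks.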
The results of the Kohonen neural network architecture were highly successful: a simple Kohonen neural network was able to separate its input data into the number of predefined student groups, that is, it was able to activate all its outputs to the desired degree. A network trained on one set of data, clustered into a certain range, was able to adapt quickly to data clustered differently. This is an important point, since it allows the network to adapt to various domains; it is this facility that the MLFF neural network cannot accomplish.

A drawback identified with the Kohonen architecture, however, was the difficulty of determining which output is associated with which student ability level. In other words, if there are 10 outputs it is not necessarily the case that the first output represents level one, the second level two, and so on. The Kohonen neural network is given the number of categories into which it should separate the input data (the number of outputs), and it is left to the Kohonen algorithm to determine which output belongs to which category. This is not accomplished in a rule-based fashion; instead, the outputs tend to evolve towards a particular solution, dependent upon the initial random start weights and the possibly random order of examples. In most cases this evolution of outputs did produce a graduation of output categories, that is, level 2 next to level 1, and so forth. This can be explained by the fact that level 1 is more similar to level 2 than it is to level 10. However, level one sometimes appeared as the first output and sometimes as the last. It was straightforward to determine which outputs were related to which category, since in the vast majority of cases the outputs occurred sequentially.

THE NUMBER OF INPUTS

The number of inputs is the amount of time-series data presented to the neural network. The practical limit for inputs is dependent upon the training data and the number of outputs. It was experimentally discovered that a large imbalance between inputs and outputs (many more outputs than inputs) produced networks that did not converge on a solution. Note that the reverse case (many more inputs than outputs) is always likely to produce a strong network, since it is being provided with a wealth of input and only asked to separate it into relatively few categories. However, these configurations were not desirable for this project for the reasons described, namely that too much history data puts too much weight on information that may be out of date. These are issues related to the domain rather than the network architecture itself. It is a key factor that the number of inputs reflects the correct solution to the problem, and there must be sufficient inputs with which to make an accurate grading of the student. It was therefore necessary to test a range of input and output configurations that could be of use to several problem areas, since the number of time-shifted inputs may be driven by the domain in question. Thus there may be many questions over a short period, or fewer questions over a longer period.
If the number of questions is small, then it may not be desirable to provide a wealth of history data, since the oldest inputs may represent the student's state of knowledge an educationally long time ago. Similarly, if the student interacts with a large number of questions over a short period, then it may be desirable to provide the network with more time-series data, to give an adequate spread over time. This is a fundamental problem, since different domains directly drive the number of required inputs to the neural network and hence negate the domain independence of the Tutorial Supervisor. A possible solution is to design a "jack-of-all-trades" neural network, able to deal with the maximum number of inputs, and then to use only a selected few inputs should the domain not require as much history information. This approach would, however, need further investigation with real student data to test its validity.

THE NUMBER OF OUTPUTS

Changing the number of outputs did not affect the successful training of the neural network. Generally, the more outputs, the more complex the network and the longer it takes to train. Larger networks also require more training data, as the learning is spread over more neurones. The upshot is that such networks require more time to train and more processing power to execute.

The number of outputs represents the number of levels. Experimentally it was determined that 20 outputs were a practical limit for a neural network with between 5 and 10 inputs, since it can be seen from the results tables (experiments 5 and 12) that successful neural networks were rendered useless by applying more than 20 outputs.

THE NUMBER OF EPOCHS

This is the number of times training items are presented to the neural network. A value of 100 was found to produce a neural network that converged on a solution, in the sense that all the outputs became excited by at least one pattern within the training data.
However, the fact that a neural network converged upon a solution does not necessarily mean that the solution is acceptable. The networks were examined to determine whether they were returning the correct number of levels presented in the input data; those networks termed successful did return the correct number of levels (as determined by examining whether all the outputs had become active to the correct degree).

DATA EXTRACTION

The network produced good results even if the training data was not supplied to it in the original order. This indicates that the network was not acting as a memory: it was not remembering the previous input and using it to generate a new output. The network does not learn to prioritise its inputs. For example, a student who has given one bad answer and five good answers will be graded the same no matter when the bad result occurred. The student's level is therefore influenced by a bad result until that result is no longer presented to the neural network. The effect is dependent upon the difference between the good results and the bad result: a large difference will make the student drop several levels and stay there for some time (the time depending upon the number of results being presented to the network and the time taken for the student to answer other tutorials with a higher degree of success). The consequence is that the number of inputs has a direct pedagogical influence, as well as significance in determining the neural network's abilities.
With fewer inputs, regardless of the neural network's abilities, the progression from one level to another will be much quicker than with more inputs. A constantly changing ability level may be distracting to the student.

LIMITING VALUES

The advantage of the Kohonen unsupervised learning architecture is that it is able to continue learning while in use, without intervention from a person. As discussed earlier, this may be necessary if the range of answers from a student population falls within a limited area, since the limited areas may differ between domains. The networks were therefore trained on data to mimic this situation. It was found that the networks could converge on a satisfactory solution for a clustering of data where the number of outputs was valid (a distinction can be made between one output level and another). For example, the networks were trained with a data set limited to values between 40 and 70% as quickly as with data between 0 and 100%. If the data ranges were set to more restricted values (between 50% and 60%, say), then a network with 20 output levels would not converge on a solution. This was to be expected, since there are only around 10 discrete possible values while the network is trying to classify 20 clusters. It was also found that a neural network could be trained to categorise data correctly over various cluster types, that is, all the levels (outputs) were activated for a given cluster to the degree to which the categories were present within the training data.
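The limiting-values observation above reduces to a simple feasibility check (a sketch; the whole-percent step size and function names are assumptions): a clustered score range cannot be separated into more output levels than it has distinct values.

```python
def distinct_values(low, high, step=1.0):
    """Number of discrete score values in [low, high] at the given step."""
    return int(round((high - low) / step)) + 1

def range_supports_levels(low, high, n_outputs, step=1.0):
    """A clustered range can only be split into as many levels as it has
    distinct values, so 20 outputs cannot converge on scores of 50-60%."""
    return distinct_values(low, high, step) >= n_outputs
```

Applied to the cases in the text, a 0-100% range easily supports 20 levels, while a 50-60% cluster (roughly 10 distinct whole-percent values) does not.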
A neural network previously trained with one set of clustered data could easily be retrained with a differently clustered data set. These results demonstrate that in practice the neural network can continually adapt to its student population.

Figure 2 shows three pie charts representing the results from a Kohonen neural network with 5 inputs and 10 outputs. The first pie chart shows a neural network that was trained with student data ranging from 0 to 100. The second pie chart shows the same neural network after it has been allowed to adapt to data in the range 40 to 70. The middle pie chart shows the network before it has adapted to the new domain data; the two large segments represent outputs of the network that formerly represented one level each but are now in effect representing many levels. For the TS to remain valid and useful it must therefore adapt to the new domain. This adaptation takes place in the same number of epochs as the original training, that is, 100 epochs. The size of each segment represents the activation of the corresponding output.

REGRADING QUESTIONS USING FUZZY LOGIC

Each question level is generally presented to a student of the same level, or just below, a pedagogy used with success by Bergeron et al. (1989): a level x student should be able, overall, to answer a level x question. A question may, however, be graded incorrectly by the domain author.
This can be determined by the system after a number of interactions with different students (a population of students who should, generally, be getting a question right are getting it wrong, or vice versa). However, it is not suitable to regrade a question immediately on the strength of an interaction with one student. As discussed earlier, a student is a complex entity and it is difficult to formulate rules describing one accurately. To resolve this problem, each question level is modelled as a fuzzy set. This allows a question's level to be adjusted slightly, within the level, without necessarily affecting the overall level (as presented to the student); there is thus a buffering effect and the question does not rapidly leap back and forth between levels. The use of fuzzy sets also provides a mechanism for allowing a question to belong to more than one question-level set, providing a smoother transition between levels. This differs from Bergeron et al.'s (1989) approach, in which data was collected from the students and then periodically used to update the training of the neural network.
In their approach there is therefore a delay, which ensures that the question levels do not suddenly change; without it, the question level could change continually and thus be distracting to the students. The drawback, however, is that this is a manual process requiring the direct intervention of the system designer. The process for regrading questions described later is achieved automatically, while still maintaining the delay between the question being presented to a student and its level being changed.

Figures 3 and 4 show fuzzy sets overlapping with each other. The first diagram shows that a question with a value of x is graded as both level two and level three. The second diagram demonstrates that if the fuzzy sets are stretched, then a question with the same value of x has membership of all three question levels. This is appropriate, since it may not be a simple matter to assign a question to one particular level. One student may find a particular question more difficult than another student of the same level, owing to some difference in previous knowledge. If a question is near the border between levels then it may appear as either level, that is, it may be offered to students of both levels. The degree to which this multiplicity of questions occurs depends upon how the fuzzy sets are defined.

A question is regraded when a population of students' interactions with it are determined to be incorrect by the TS, for the reasons previously described. Such erroneous interactions cause the question's ability value to move within the fuzzy set until it crosses into a different fuzzy set.

A question's ability level is therefore only changed after a number of erroneous interactions with students, the actual number depending upon the size of the fuzzy set; the fuzzy set is thus acting as a buffer. A simple fuzzy processor accomplishes question regrading. The fuzzy processor compares the level of the current question with the student-level output of the TS neural network. The buffer can be implemented using three fuzzy rules (Kosko, 1996):

1 IF S_LEVEL > Q_LEVEL THEN Q_LEVELf = Q_LEVELf + 1
2 IF S_LEVEL < Q_LEVEL THEN Q_LEVELf = Q_LEVELf - 1
3 IF S_LEVEL = Q_LEVEL THEN Q_LEVELf = Q_LEVELf (remain unchanged)

where:

S_LEVEL is the level assigned to a student.
Q_LEVEL is the level of the question, used to decide whether it is suitable for the student.
Q_LEVELf is the fuzzy membership number of the question.

Using Figure 5, a question may belong to one or two of four levels. If the question has a value of Q_LEVELf corresponding to x, then the question is regarded as both level three and level four. If, however, as a result of interactions with several students, rule one is repeatedly fired, then the value of Q_LEVELf will increase and the question will become graded as level four only. Conversely, if rule two is repeatedly fired then the value of Q_LEVELf will decrease and the question will be graded as level three only. Such changes may result in further increases or decreases in the value of Q_LEVELf, which may reach any of the levels available. The use of this fuzzy system ensures that the level of a question is not changed (in terms of presentation to students) in response to individual interactions with students. The fuzzy logic is able to distinguish overall trends and is therefore robust in the presence of exception data.

The changing of levels is dependent upon the size of the fuzzy sets, since the size of the fuzzy set directly affects how many interactions are required before a question migrates from one level to another. The start and end points of the sets also define whether a question can belong to more than one level. It is intuitively sensible to provide a small overlap of neighbouring fuzzy sets. This provides a simple buffer that helps the question find its true grade. For example, if the fuzzy sets were defined separately (as discrete sets), then a question might be presented as a difficult level but become graded as an easier level after several interactions. Once this change has occurred, the question is no longer presented to the original class of students that graded it, and it may become stuck at a particular level. If instead the question is presented both to the original class of student and to the new class, then a smoother progression from one level to another may result, and the extra information enables the question level to settle more easily.
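The three rules can be sketched as a minimal fuzzy buffer (the set width of 10 membership units per level, the crisp rather than overlapping sets, and the function names are simplifying assumptions for illustration):

```python
def fuzzy_buffer_step(s_level, q_level, q_levelf):
    """One firing of the three rules: the membership number Q_LEVELf
    drifts toward the level of the students answering the question."""
    if s_level > q_level:
        return q_levelf + 1   # rule 1
    if s_level < q_level:
        return q_levelf - 1   # rule 2
    return q_levelf           # rule 3: remain unchanged

def question_level(q_levelf, set_width=10):
    """Map the membership number onto a discrete level: many consistent
    firings are needed before Q_LEVELf crosses into the next set."""
    return q_levelf // set_width

# ten interactions by level-4 students with a question graded level 3
q_levelf = 35                 # membership number inside level 3's set (30-39)
for _ in range(10):
    q_levelf = fuzzy_buffer_step(4, question_level(q_levelf), q_levelf)
```

Because the rules compare against the question's current discrete level, the drift stops as soon as the question settles in the students' level; isolated exception interactions only nudge the membership number and do not change the level presented to students.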
If the fuzzy sets are defined so that a question may belong to more than two levels, then the question level may not settle at all, as one level of student may push the value of Q_LEVELf one way and another level may push it in the opposite direction.

RESULTS

The Kohonen neural network proved to be a successful architecture for the problem of grading students into ability levels. Most permutations of parameters produced neural networks that converged upon a solution. A fast and reliable neural network could be produced with between 5 and 20 inputs or outputs. It is possible to increase this number, but this is unlikely to be required, since it is not desirable to use information from too far in the past, the network having no knowledge of time. It has been experimentally determined that the number of outputs should not rise above 20. If more levels are required then the outputs may be combined to form fuzzy sets.

Parameters for a generic Kohonen Tutorial Supervisor, that is, one that will converge on a solution and provide a high degree of student grading for a variety of domains, are suggested as the following:

* 5 inputs -- this has been found to provide the network with enough information with which to evaluate the student. It is suggested that an input filter, as previously described, is employed if more than this number of inputs is required.

* 10 outputs -- corresponding to 10 student levels. If more levels are required then 30 is advised as the upper limit; beyond this the neural network becomes less likely to activate all of its outputs.

* Any form of data extraction may be used to form the training set; however, a large number of examples is required to produce a fully trained network. One thousand student interactions provide enough examples. Note, therefore, that training a network with real student data is probably impractical.
However, since the Kohonen network is fully adaptable, it is suggested that a network be trained upon generated data and then allowed to adapt to real students. The generated data could be designed to represent realistic but uncomplicated situations, for example, training the network to model simple rules such as "IF result is in the range 40-50 THEN set student level to 5." The neural network is then able to adapt to any misconceptions present in these rules. Once a neural network has been trained it may be saved and replicated.

DISCUSSION

A key issue of concern regarding the TS is the number of student levels that the TS is to recognise and output. Each student level should have tutorial material generated for it; since it is important to target tutorial tasks at the student's ability, this is seen as being of more educational benefit than offering the same tutorials to all students and then assigning a student level based upon the grade that the student achieves (Bergeron, 1989; Hartley, 1993), although there is no technical reason why the latter could not be done. The number of student levels may therefore change between domains, since some domains may have a richer set of assessment questions than others (for a number of possible reasons). A possible conflict could therefore arise between the number of student levels that has been designed into the TS by the system designer and the number of student levels that are required by the current domain author.
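The suggested set-up, a Kohonen layer with 5 inputs and 10 outputs trained for 100 epochs on around one thousand generated interactions labelled by a simple rule, might be sketched as below. The labelling rule, learning-rate schedule, and neighbourhood schedule are assumptions for illustration only, not the article's actual implementation:

```python
# Minimal 1-D Kohonen (self-organising map) sketch trained on
# rule-generated data; all schedule parameters are assumed values.
import numpy as np

rng = np.random.default_rng(0)

def make_example():
    """Five recent tutorial results in 0-100; the target level is the
    decile of their mean (the kind of simple rule the text describes)."""
    results = rng.uniform(0, 100, size=5)
    level = min(int(results.mean() // 10), 9)   # 10 student levels
    return results / 100.0, level

# 1000 generated interactions, as suggested for a fully trained network
data = [make_example() for _ in range(1000)]

weights = rng.uniform(size=(10, 5))             # 10 output units x 5 inputs

for epoch in range(100):                        # 100 epochs sufficed in the trials
    lr = 0.5 * (1 - epoch / 100)                # decaying learning rate
    radius = max(1, int(3 * (1 - epoch / 100))) # shrinking neighbourhood
    for x, _ in data:
        winner = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
        mask = np.abs(np.arange(10) - winner) <= radius
        weights[mask] += lr * (x - weights[mask])  # move winner and neighbours

# After training, the winning output unit acts as the student's grade cluster.
x, level = make_example()
winner = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
```

Once trained on the generated data, such a network could continue to adapt online as real student interactions arrive, which is the property the article relies on.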
A possible solution to this conflict is for the system designer to provide a TS that is capable of outputting a large number of student levels; each domain author can then allow it to adapt to their domain and ignore the inactive outputs from the TS, which will naturally arise if there are not sufficient input student levels. The benefit of this approach is that one TS configuration could be used for many domains without the need for reconfiguration. The drawback, however, is that some outputs of the TS will always remain inactive, although the experiments carried out as part of the research demonstrated that active outputs tend to cluster together and so are easily identifiable. A problem related to the number of outputs is the number of inputs.

The amount of history data presented to the TS directly affects the grading of the student, in that the more history data presented to the neural network, the greater the effect of previous results with tutorials. This is a similar situation to that of the number of student levels, in that it is possible to design a TS with a large number of inputs and then use only the required amount. However, it is not a simple matter to determine how much history data to present to the neural network in order to aid the student the most. This issue is difficult to resolve without extensive trials with real students, and even if these were carried out it would still be unlikely that any firm conclusions could be drawn, since proving the effectiveness of educational systems is notoriously
difficult in the educational field (Dillon & Gabbard, 1998). The purpose of the TS here is to explore the technical issues relating to the feasibility of providing an automatic student grading system. Whether or not this facility is useful is open to educational debate. However, it is likely to be useful should the correct set-up of the TS be achieved during trials with real students, since Bergeron et al. (1989) found their TS to be useful.

Further issues arise concerning the adaptability of the neural network used for the TS. The neural network architecture used by Bergeron et al. (1989) requires offline training and is therefore under the control of the system designer. The drawback with this approach is that it requires the manual intervention of a person who can interpret the student interaction data with tutorials and determine whether it should be re-presented to the neural network. The advantage of the Kohonen neural network architecture is that it is able to train continually without any intervention from a human. However, there are situations where this adaptation is undesirable, most notably when students of different skill levels use the same domain at different times, for example, if a class of first year students uses the system followed by a class of final year students. This is not a problem, however, if the questions and tutorials have been adequately assigned a difficulty level, since the first year students will only be offered easier tutorials and so can be graded only as lower level students (although they can still progress if they continue to achieve success with the tutorials).
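The safeguard described above, presenting a student only with questions already graded near their own level, can be sketched as follows; the question pool and the size of the level-matching window are hypothetical:

```python
# Hypothetical sketch: offer a student only questions whose graded
# difficulty lies within a window around the student's current level,
# so a new cohort cannot drag well-graded questions to the wrong level.

def select_questions(questions, s_level, window=1):
    """questions: list of (question_id, q_level) pairs; returns the
    ids of questions within `window` levels of the student."""
    return [q for q, q_level in questions
            if abs(q_level - s_level) <= window]

pool = [("q1", 1), ("q2", 2), ("q3", 5), ("q4", 9)]
# A first-year student graded at level 1 only sees the easier questions:
select_questions(pool, 1)   # -> ["q1", "q2"]
```

Because the level-1 student never sees "q3" or "q4", their failures cannot fire rule two against the harder questions, which is the protection the text describes.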
Problems can arise only if both the student abilities and the question difficulties are unknown beforehand. This is because the TS acts as a bidirectional mapping device: if either the student abilities or the question difficulties are known beforehand, then the TS can produce the unknown parameter. It is not, however, able to produce values when nothing is known beforehand. The TS's ability to regrade questions automatically is an exploitation of this bidirectional mapping facility, in that the student ability can be changed in response to improving results and the question difficulty can be changed if a significant proportion of students who should get the question right in fact get it wrong.

Research into the TS has demonstrated that a fully adaptable system for automatically grading students is possible and practical. The approach of using an automatic tutorial supervisor has been practically justified by Bergeron et al. (1989). However, their system requires periodic manual retraining, which renders it unsuitable for a generic tutorial system, or a tutorial system that can be used without the need for reprogramming or otherwise rearranging the program code of the system.

CONCLUSION

This article has described the design and training of the neural networks for the TS.
It was argued that the Kohonen neural network is the best architecture for this problem, because of its ability to adapt to various domains without the intervention of a human. Experimental designs for the Kohonen neural network were presented and discussed. The ability of the TS neural network to adapt to different domains and the ability of the TS's fuzzy processor to adapt to question difficulties were explained.

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

[FIGURE 3 OMITTED]

[FIGURE 4 OMITTED]

[FIGURE 5 OMITTED]

Table 1
Altering the number of inputs and outputs

Network  Inputs  Outputs  Epochs  Range Train  Range Test  Success
1        5       5        100     0-100        0-100       Yes
2        5       10       100     0-100        0-100       Yes
3        5       20       100     0-100        0-100       Yes
4        5       30       100     0-100        0-100       No
5        5       30       500     0-100        0-100       No
6        5       10       100     40-70        40-70       Yes
7        5       10       100     0-100        40-70       Yes
8        10      5        100     0-100        0-100       Yes
9        10      10       100     0-100        0-100       Yes
10       10      20       100     0-100        0-100       Yes
11       10      30       100     0-100        0-100       Yes
12       10      40       100     0-100        0-100       No
13       20      40       100     0-100        0-100       No
14       10      40       500     0-100        0-100       No

Table 2
Range and epoch experiments

Network  Inputs  Outputs  Epochs  Range Train  Range Test  Success
16       5       10       25      0-100        0-100       No
17       5       10       50      40-70        40-70       Yes
18       5       10       50      40-70        0-100       No
19       5       10       100     40-70        0-100       Yes
20       5       10       50      0-100        40-70       No

REFERENCES

Bergeron, B., Morse, A., & Greenes, R. (1989). A generic neural network based tutorial supervisor for C.A.I. 14th Annual Symposium on Computer Applications in Medical Care. IEEE Publishing, pp. 435-439.

Conklin, J. (1987, September).
Hypertext: An introduction and survey. IEEE Computer, pp. 17-41.

Dillon, A., & Gabbard, R. (1998). Hypermedia as an educational technology. Review of Educational Research, 68(3), 322-349.

Elsom-Cook, M. (1989). Guided discovery tutoring and bounded user modelling. In J. Self (Ed.), Artificial intelligence and human learning, pp. 65-170. London, UK: Chapman & Hall.

Fine, T.I. (1999). Feedforward neural network methodology. New York: Springer-Verlag.

Hagan, M.T., Bemurth, H., & Beale, M. (1996). Neural network design. Boston: PWS Publishing.

Hartley, J.R. (1993). Interacting with multimedia. University Computing, 15, 129-136.

Jonassen, D., & Grabinger, R.S. (1990).
Problems and issues in designing hypertext/hypermedia for learning. In D. Jonassen (Ed.), Hypermedia for learning, pp. 3-25. Berlin, Germany: Springer-Verlag.

Kohonen, T. (1989). Self-organisation and associative memory. Berlin, Germany: Springer-Verlag.

Kosko, B. (1996). Fuzzy thinking: The new science of fuzzy logic. Harlow, Essex, UK: Prentice-Hall.

Masters, T. (1993). Practical neural network recipes in C++. London: Academic Press.

Mullier, D.J. (1999). The application of neural network and fuzzy-logic techniques to educational hypermedia. Unpublished doctoral dissertation, Leeds Metropolitan University. [Online]. Available: www.lmu.ac.uk/ies/comp/staff/dmullier/ and www.mullier.co.uk

Mullier, D.J., Hobbs, D.J., & Moore, D.J. (2002). Identifying and using hypermedia browsing patterns. Journal of Educational Multimedia and Hypermedia, 11(2), 31-50.

Skapura, D. (1996). Building neural networks. Boston: Addison-Wesley.

Theng, Y., & Thimbleby, H. (1998). Addressing design and usability
issues in hypertext and on the World Wide Web by reexamining the "Lost in hyperspace" problem. Journal of Universal Computer Science, 4(11), 839-855.

DUNCAN MULLIER, LEEDS METROPOLITAN UNIVERSITY, UK

E-MAIL: d.mullier@lmu.ac.uk
