Examples of Supervised and Unsupervised Learning

Types of Learning

ANN Classification is an example of Supervised Learning. Known class labels help indicate whether the system is performing correctly or not. This information can be used to indicate a desired response, validate the accuracy of the system, or be used to help the system learn to behave correctly.

Clustering is an example of Unsupervised Learning where the class labels are not presented to the system that is trying to discover the natural classes in a dataset. Clustering often fails to find known classes because the distinction between the classes can be obscured by the large number of features (genes) which are uncorrelated with the classes.
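
As a rough illustration of the difference, the sketch below uses a made-up one-dimensional dataset. The supervised part uses the known labels to compute one mean per class (a nearest-mean classifier); the unsupervised part runs a simple two-cluster k-means that never sees the labels. The data, the function names, and the choice of these two algorithms are assumptions made for illustration only.

```python
# Toy 1-D data: two groups, with labels available only to the supervised learner.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["a", "a", "a", "b", "b", "b"]


def fit_nearest_mean(xs, ys):
    """Supervised: use the known labels to compute one mean per class."""
    means = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(members) / len(members)
    return means


def predict(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda label: abs(x - means[label]))


def two_means(xs, iterations=10):
    """Unsupervised: 2-cluster k-means; the labels are never consulted."""
    centres = [min(xs), max(xs)]  # simple deterministic initialisation
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centres[0]) <= abs(x - centres[1]) else 1
            groups[nearest].append(x)
        # Keep the old centre if a group happens to be empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres


class_means = fit_nearest_mean(points, labels)
cluster_centres = two_means(points)
```

On data this clean the cluster centres coincide with the class means; the text's point is that with many irrelevant features this agreement is not guaranteed.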

Receptive Field

The receptive field of a neuron is a region of space in which the presence of a stimulus will alter the firing of that neuron.


  • The neocognitron, proposed by Fukushima (1980), is a hierarchical multilayered neural network capable of robust visual pattern recognition through learning.
  • The lowest stage is the input layer, consisting of a two-dimensional array of cells that correspond to the photoreceptors of the retina.
  • The retina is a multi-layered sensory tissue that lines the back of the eye.
  • It contains millions of photoreceptors that capture light rays and convert them into electrical impulses. These impulses travel along the optic nerve to the brain where they are turned into images.
  • There are two types of photoreceptors in the retina: rods and cones. The retina contains approximately 6 million cones. The cones are concentrated in the macula, the portion of the retina responsible for central vision.
  • Cones function best in bright light and allow us to appreciate color.
  • There are approximately 125 million rods. They are spread throughout the peripheral retina and function best in dim lighting.
  • The rods are responsible for peripheral and night vision.

Mathematical Model

A mathematical model is an abstract model that uses mathematical language to describe the behaviour of a system. Mathematical models are used particularly in the natural sciences and engineering disciplines (such as physics, biology, and electrical engineering) but also in the social sciences (such as economics, sociology and political science); physicists, engineers, computer scientists, and economists use mathematical models most extensively.
Eykhoff (1974) defined a mathematical model as 'a representation of the essential aspects of an existing system (or a system to be constructed) which presents knowledge of that system in usable form'.
Examples of mathematical models
Model of rational behavior for a consumer. In this model we assume a consumer faces a choice of n commodities labelled 1,2,...,n each with a market price p1, p2,..., pn. The consumer is assumed to have a cardinal utility function U (cardinal in the sense that it assigns numerical values to utilities), depending on the amounts of commodities x1, x2,..., xn consumed. The model further assumes that the consumer has a budget M which she uses to purchase a vector x1, x2,..., xn in such a way as to maximize U(x1, x2,..., xn). The problem of rational behavior in this model then becomes an optimization problem, that is:

Max U(x1, x2, ..., xn)
subject to:
p1*x1 + p2*x2 + ... + pn*xn <= M
xi >= 0, where i = 1, 2, ..., n
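
To make the optimization problem concrete, here is a minimal sketch that searches whole-number bundles by brute force. It assumes the budget constraint implied by the text (total spending p1*x1 + ... + pn*xn must not exceed M); the function name `best_bundle`, the utility U(x1, x2) = x1 * x2, the prices, and the budget are all hypothetical.

```python
from itertools import product


def best_bundle(prices, budget, utility, max_units=20):
    """Search whole-number bundles x with x_i >= 0 and sum(p_i * x_i) <= budget."""
    best, best_value = None, float("-inf")
    for bundle in product(range(max_units + 1), repeat=len(prices)):
        cost = sum(p * x for p, x in zip(prices, bundle))
        if cost <= budget:  # only affordable bundles are candidates
            value = utility(bundle)
            if value > best_value:
                best, best_value = bundle, value
    return best, best_value


# Hypothetical example: two commodities with prices 2 and 1, budget M = 12.
prices = [2, 1]
budget = 12
bundle, value = best_bundle(prices, budget, lambda x: x[0] * x[1])
print(bundle, value)
```

Brute force is only practical for a handful of commodities; it is used here because it mirrors the problem statement line by line.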

Classification of mathematical models

Mathematical models can be classified in several ways, some of which are described below.

1. Linear vs. nonlinear: Mathematical models are usually composed of variables, which are abstractions of quantities of interest in the described systems, and operators that act on these variables, which can be algebraic operators, functions, differential operators, etc. If all the operators in a mathematical model are linear, the resulting mathematical model is defined as linear. A model is considered to be nonlinear otherwise. In a mathematical programming model, if the objective functions and constraints are represented entirely by linear equations, then the model is regarded as a linear model. If one or more of the objective functions or constraints are represented with a nonlinear equation, then the model is known as a nonlinear model.

2. Deterministic vs. probabilistic (stochastic): A deterministic model is one in which every set of variable states is uniquely determined by parameters in the model and by sets of previous states of these variables. Therefore, deterministic models perform the same way for a given set of initial conditions. Conversely, in a stochastic model, randomness is present, and variable states are not described by unique values, but rather by probability distributions.

3. Static vs. dynamic: A static model does not account for the element of time, while a dynamic model does. Dynamic models are typically represented with difference equations or differential equations.

4. Discrete vs. continuous: A discrete model (or network) works with discrete values, while a continuous model (or network) works with continuous values.
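
The deterministic/stochastic distinction can be sketched with a small dynamic model written as a difference equation. The growth model, the function names, and the Gaussian noise term below are assumptions chosen only to show the contrast.

```python
import random


def deterministic_growth(x0, rate, steps):
    """Each state is fully determined by the previous state and the rate."""
    states = [x0]
    for _ in range(steps):
        states.append(states[-1] * (1 + rate))
    return states


def stochastic_growth(x0, rate, noise, steps, rng):
    """The growth rate is perturbed by Gaussian noise at every step."""
    states = [x0]
    for _ in range(steps):
        states.append(states[-1] * (1 + rate + rng.gauss(0.0, noise)))
    return states


# Same initial conditions always give the same deterministic trajectory.
run_a = deterministic_growth(1.0, 0.1, 5)
run_b = deterministic_growth(1.0, 0.1, 5)

# Different random streams give different stochastic trajectories.
noisy_a = stochastic_growth(1.0, 0.1, 0.05, 5, random.Random(1))
noisy_b = stochastic_growth(1.0, 0.1, 0.05, 5, random.Random(2))
```

Both functions are dynamic and discrete-time; removing the loop and returning a single value would make the model static.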

Kohonen Networks

Check out this page, which gives the learning algorithm very clearly, followed by a demonstration that gives an idea of how the networks behave.

Meaning of the term Stochastic

Stochastic, from the Greek "stochos" or "aim, guess", means of, relating to, or characterized by conjecture and randomness. A stochastic process is one whose behavior is non-deterministic in that a state does not fully determine its next state.

Stochastic Neural Networks

Stochastic neural networks are a type of artificial neural network, itself a tool of artificial intelligence. They are built by introducing random variations into the network, either by giving the network's neurons stochastic transfer functions or by giving them stochastic weights. This makes them useful tools for optimization problems, since the random fluctuations help the network escape from local minima. Stochastic neural networks built using stochastic transfer functions are often called Boltzmann machines.
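
A stochastic transfer function can be sketched as a unit that fires with a probability given by a logistic function of its net input, in the style of a Boltzmann machine unit. The function name, the temperature parameter, and the example weights below are assumptions for illustration.

```python
import math
import random


def stochastic_neuron(inputs, weights, temperature, rng):
    """Return 1 with probability sigmoid(net / T), otherwise 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    p_fire = 1.0 / (1.0 + math.exp(-net / temperature))
    return 1 if rng.random() < p_fire else 0


# The same input does not always give the same output.
rng = random.Random(0)
outputs = [stochastic_neuron([1.0, 0.5], [2.0, -1.0], 1.0, rng) for _ in range(1000)]
firing_rate = sum(outputs) / len(outputs)
print(firing_rate)  # should be close to sigmoid(1.5), roughly 0.82
```

Raising the temperature pushes the firing probability towards 0.5 (more random behaviour); lowering it makes the unit behave more like a deterministic threshold.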

Simulated Annealing

Simulated annealing (SA) is a generic probabilistic meta-algorithm for the global optimization problem, namely locating a good approximation to the global optimum of a given function in a large search space.

The name and inspiration come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. The heat causes the atoms to become unstuck from their initial positions (a local minimum of the internal energy) and wander randomly through states of higher energy; the slow cooling gives them more chances of finding configurations with lower internal energy than the initial one.

By analogy with this physical process, each step of the SA algorithm replaces the current solution by a random "nearby" solution; if the new solution is better, it is chosen, whereas if it is worse, it can still be chosen with a probability that depends on the difference between the corresponding function values and on a global parameter T (called the temperature), which is gradually decreased during the process. The dependency is such that the current solution changes almost randomly when T is large, but increasingly "downhill" as T goes to zero. The allowance for "uphill" moves saves the method from becoming stuck at local minima, which are the bane of greedier methods.
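
The procedure described above can be sketched as follows. The text does not give the exact acceptance formula, so this sketch assumes the commonly used Metropolis rule (accept a worse move with probability exp(-delta / T)) and a geometric cooling schedule; the function name and all parameter values are hypothetical.

```python
import math
import random


def simulated_annealing(f, x0, rng, t_start=10.0, t_end=1e-3, cooling=0.99, step=1.0):
    """Minimise f over the reals using a simple geometric cooling schedule."""
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    t = t_start
    while t > t_end:
        candidate = x + rng.uniform(-step, step)  # random "nearby" solution
        fc = f(candidate)
        delta = fc - fx
        # Always accept improvements; accept worse moves with prob exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        t *= cooling  # gradually lower the temperature
    return best_x, best_fx


def objective(x):
    """Hypothetical test function with its minimum at x = 3."""
    return (x - 3.0) ** 2


best_x, best_fx = simulated_annealing(objective, 0.0, random.Random(0))
print(best_x, best_fx)
```

Tracking the best solution seen so far is a convenience added here; the description in the text only follows the current solution.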

Statistical Learning - Bayesian Logic

Bayesian logic

Named for Thomas Bayes, an English clergyman and mathematician, Bayesian logic is a branch of logic applied to decision making and inferential statistics that deals with probability inference: using the knowledge of prior events to predict future events.

Bayes' Theorem is a means of quantifying uncertainty. Based on probability theory, the theorem defines a rule for refining a hypothesis by factoring in additional evidence and background information, and leads to a number representing the degree of probability that the hypothesis is true. To demonstrate an application of Bayes' Theorem, suppose that we have a covered basket that contains three balls, each of which may be green or red. In a blind test, we reach in and pull out a red ball. We return the ball to the basket and try again, again pulling out a red ball. Once more, we return the ball to the basket and pull a ball out: red again. We form a hypothesis that all the balls are, in fact, red. Bayes' Theorem can be used to calculate the probability (p) that all the balls are red (an event labeled as "A") given (symbolized as "|") that all the selections have been red (an event labeled as "B"):

p(A|B) = p(A and B) / p(B)

Suppose each ball is independently red or green with equal probability, so that the basket holds one of the combinations RRR, RRG, RGG, GGG. In 1/8 of all possible outcomes, all the balls are red AND all the selections are red, so p(A and B) = 1/8; the chance that all three selections are red, p(B), is 1/4. Bayes' Theorem then gives the probability that all the balls in the basket are red, given that all the selections have been red, as (1/8)/(1/4) = .5 (probabilities are expressed as numbers between 0 and 1, with 1 indicating 100% probability and 0 indicating zero probability).
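
The arithmetic can be checked by enumerating every possible colouring of the basket. This sketch assumes each ball is independently red or green with probability 1/2, which is the assumption under which the figure of .5 comes out; the variable and function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: each of the three balls is independently red or green (prob. 1/2 each).
colourings = list(product("RG", repeat=3))  # 8 equally likely baskets
prior = Fraction(1, len(colourings))


def p_three_red_draws(colouring):
    """Probability of drawing red three times in a row, with replacement."""
    p_red = Fraction(colouring.count("R"), len(colouring))
    return p_red ** 3


# p(B): all three selections are red, summed over every possible basket.
p_b = sum(prior * p_three_red_draws(c) for c in colourings)
# p(A and B): the basket is all red and all three selections are red.
p_a_and_b = prior * p_three_red_draws(("R", "R", "R"))
# Bayes' Theorem: p(A|B) = p(A and B) / p(B).
p_a_given_b = p_a_and_b / p_b
print(p_b, p_a_and_b, p_a_given_b)  # 1/4 1/8 1/2
```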

Preliminary Study Material

Check this out: it gives a really simple explanation with good examples, enough to get a preliminary grasp of the subject.

Sigmoid functions in neural networks

Sigmoid functions are often used in neural networks to introduce nonlinearity in the model and/or to clamp signals to within a specified range. A popular neural net element computes a linear combination of its input signals, and applies a bounded sigmoid function to the result; this model can be seen as a "smoothed" variant of the classical threshold neuron.

One reason for its popularity in neural networks is that the sigmoid function satisfies the differential equation y' = y(1 − y).

The right hand side is a low order polynomial. Furthermore, the polynomial has factors y and (1 − y), both of which are simple to compute. Given y = sig(t) at a particular t, the derivative of the sigmoid function at that t can be obtained by multiplying the two factors together. These relationships result in simplified implementations of artificial neural networks with artificial neurons.
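
The relationship can be checked numerically. The sketch below assumes sig is the logistic function 1 / (1 + e^(-t)), which is the sigmoid that satisfies y' = y(1 − y); the helper names are hypothetical, and the finite-difference estimate is there only as a cross-check.

```python
import math


def sig(t):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-t))


def sig_derivative(t):
    """Derivative computed from the output alone: y' = y * (1 - y)."""
    y = sig(t)
    return y * (1.0 - y)


def numerical_derivative(f, t, h=1e-6):
    """Central-difference estimate, used here only as a cross-check."""
    return (f(t + h) - f(t - h)) / (2 * h)


for t in (-2.0, 0.0, 1.5):
    print(t, sig_derivative(t), numerical_derivative(sig, t))
```

In a network implementation this means the derivative needed during training can be obtained from the unit's already-computed output, without evaluating the exponential again.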

Neural Network and Connectionist Models

The human brain is an incredibly impressive information processor, even though it "works" quite a bit slower than an ordinary computer. Many researchers in artificial intelligence look to the organization of the brain as a model for building intelligent machines.
Think of a sort of "analogy" between the complex webs of interconnected neurons in a brain and the densely interconnected units making up an artificial neural network (ANN), where each unit--just like a biological neuron--is capable of taking in a number of inputs and producing an output. Consider this description: "To develop a feel for this analogy, let us consider a few facts from neurobiology. The human brain is estimated to contain a densely interconnected network of approximately 10^11 neurons, each connected, on average, to 10^4 others. Neuron activity is typically excited or inhibited through connections to other neurons. The fastest neuron switching times are known to be on the order of 10^-3 seconds, quite slow compared to computer switching speeds of 10^-10 seconds.
Yet humans are able to make surprisingly complex decisions, surprisingly quickly. For example, it requires approximately 10^-1 seconds to visually recognize your mother. Notice the sequence of neuron firings that can take place during this 10^-1-second interval cannot possibly be longer than a few hundred steps, given the switching speed of single neurons. This observation has led many to speculate that the information-processing abilities of biological neural systems must follow from highly parallel processes operating on representations that are distributed over many neurons. One motivation for ANN systems is to capture this kind of highly parallel computation based on distributed representations."

[From Machine Learning (Section 4.1.1; page 82) by Tom M. Mitchell, McGraw Hill Companies, Inc. (1997).]

Blue Brain Project

Inside our head nestles a forest of billions of neurons which weave together to make our thoughts. Man has long wanted to discover the secrets of the brain, and has done so with varying degrees of success.

Recently, advances in this area of science have been limited by the power of computers. But at Switzerland's École Polytechnique Fédérale de Lausanne, the Blue Brain Project aims to change this by simulating the structures and functions of the brain.

The Blue Brain project is the first comprehensive attempt to reverse-engineer the mammalian brain, in order to understand brain function and dysfunction through detailed simulations.

To know more about the Blue Brain Project, please use this link.

The network in artificial neural network

The word network in the term 'artificial neural network' arises because the function f(x) is defined as a composition of other functions gi(x), which can further be defined as a composition of other functions. This can be conveniently represented as a network structure, with arrows depicting the dependencies between variables.
A widely used type of composition is the nonlinear weighted sum, where f(x) = K(sum_i wi*gi(x)), with weights wi and where K is some predefined function, such as the hyperbolic tangent. It will be convenient for the following to refer to a collection of functions gi as simply a vector g = (g1,g2,...,gn).
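
A minimal sketch of this composition, assuming the nonlinear weighted sum has the usual form f(x) = K(sum_i wi*gi(x)). The weights wi are not named in the text, and the helper `compose`, the component functions, and the weight values are hypothetical.

```python
import math


def compose(weights, gs, K=math.tanh):
    """Build f(x) = K(sum_i w_i * g_i(x)) from component functions g_i."""
    def f(x):
        return K(sum(w * g(x) for w, g in zip(weights, gs)))
    return f


# Hypothetical component functions g = (g1, g2) and weights.
g = [lambda x: x, lambda x: x ** 2]
f = compose([0.5, -0.25], g)
print(f(1.0))  # tanh(0.5 * 1.0 - 0.25 * 1.0)
```

Because each gi may itself be built by `compose`, nesting these calls gives the layered network structure the text describes.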

Economic Uses of ANN

The economic uses of ANNs may be the most exciting.

Large financial institutions have used ANNs to improve performance in such areas as bond rating, credit scoring, target marketing and evaluating loan applications. These systems are typically only a few percentage points more accurate than their predecessors, but because of the amounts of money involved, they are very profitable. ANNs are now used to analyze credit card transactions to detect likely instances of fraud.
ANNs are used to discover other kinds of crime, too. Bomb detectors in many U.S. airports use ANNs to analyze airborne trace elements to sense the presence of explosive chemicals. And the personnel office of the Chicago Police Department uses ANNs to try to root out corruption among police officers.

Architecture of Neural Network

Feed-forward networks
Feed-forward ANNs (figure 1) allow signals to travel one way only: from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up or top-down.
Feedback networks
Feedback networks (figure 1) can have signals travelling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' is changing continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations.
Figure 4.1 An example of a simple feedforward network
Figure 4.2 An example of a complicated network
Network layers
The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units. (see Figure 4.1)
The activity of the input units represents the raw information that is fed into the network.
The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units.
The behaviour of the output units depends on the activity of the hidden units and the weights between the hidden and output units.
This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.
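
The three-layer arrangement described above can be sketched as a single forward pass. The logistic activation, the bias terms, the 2-3-1 layout, and every weight value are assumptions made for illustration; the text specifies only that each layer's activity depends on the previous layer's activity and the connecting weights.

```python
import math


def layer(inputs, weights, biases):
    """One fully connected layer with a logistic activation."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        net = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-net)))
    return outputs


def feed_forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input -> hidden -> output; signals travel one way only."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return hidden, layer(hidden, output_w, output_b)


# Hypothetical 2-3-1 network with arbitrary weights.
hidden_w = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]  # one row per hidden unit
hidden_b = [0.0, 0.1, -0.1]
output_w = [[1.0, -1.0, 0.5]]                      # one row per output unit
output_b = [0.2]

hidden, output = feed_forward([1.0, 0.0], hidden_w, hidden_b, output_w, output_b)
print(hidden, output)
```

Changing a row of `hidden_w` changes when the corresponding hidden unit is active, which is the sense in which a hidden unit "chooses what it represents".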
We also distinguish single-layer and multi-layer architectures. The single-layer organisation, in which all units are connected to one another, constitutes the most general case and is of more potential computational power than hierarchically structured multi-layer organisations. In multi-layer networks, units are often numbered by layer, instead of following a global numbering.
The most influential work on neural nets in the 1960s went under the heading of 'perceptrons', a term coined by Frank Rosenblatt. The perceptron (figure 4.4) turns out to be an MCP model (a neuron with weighted inputs) with some additional, fixed pre-processing. Units labelled A1, A2, Aj, Ap are called association units, and their task is to extract specific, localised features from the input images. Perceptrons mimic the basic idea behind the mammalian visual system. They were mainly used in pattern recognition, even though their capabilities extended much further.
In 1969 Minsky and Papert wrote a book in which they described the limitations of single-layer perceptrons. The impact of the book was tremendous and caused a lot of neural network researchers to lose interest. The book was very well written and showed mathematically that single-layer perceptrons could not do some basic pattern recognition operations, like determining the parity of a shape or determining whether a shape is connected or not. What was not realised until the 1980s is that, given the appropriate training, multilevel perceptrons can do these operations.
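
The single-layer limitation can be seen on a tiny example. The sketch below trains one threshold unit with a Rosenblatt-style learning rule on AND (which it can learn) and on XOR, the two-input parity function (which no single threshold unit can represent). The function names, learning rate, and epoch count are assumptions; this is not the construction used in Minsky and Papert's book.

```python
def train_perceptron(samples, epochs=50, lr=1.0):
    """Rosenblatt-style learning rule for a single threshold unit."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - out  # -1, 0 or +1
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


def accuracy(samples, w, b):
    """Fraction of samples the trained unit classifies correctly."""
    hits = sum(
        (1 if w[0] * x1 + w[1] * x2 + b > 0 else 0) == target
        for (x1, x2), target in samples
    )
    return hits / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_accuracy = accuracy(AND, *train_perceptron(AND))
xor_accuracy = accuracy(XOR, *train_perceptron(XOR))
print(and_accuracy, xor_accuracy)
```

AND is linearly separable, so the rule settles on a correct set of weights; XOR is not, so the accuracy stays below 1.0 however long the unit is trained.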

Refer to it

Refer to this site, which gives a detailed description of neural networks:

Intro to Artificial Neural Network

A neural network is an interconnected group of nodes, akin to the vast network of neurons in the human brain.

More complex neural networks are often used in Parallel Distributed Processing.
Where are Neural Networks applicable?
..... or are they just a solution in search of a problem?
Neural networks cannot do anything that cannot be done using traditional computing techniques, BUT they can do some things which would otherwise be very difficult.
In particular, they can form a model from their training data (or possibly input data) alone.
This is particularly useful with sensory data, or with data from a complex (e.g. chemical, manufacturing, or commercial) process. There may be an algorithm, but it is not known, or has too many variables. It is easier to let the network learn from examples.
Neural networks are being used:
in investment analysis:
to attempt to predict the movement of stocks, currencies, etc., from previous data. There, they are replacing earlier, simpler linear models.
in signature analysis:
as a mechanism for comparing signatures made (e.g. in a bank) with those stored. This is one of the first large-scale applications of neural networks in the USA, and is also one of the first to use a neural network chip.
in process control:
there are clearly applications to be made here: most processes cannot be determined as computable algorithms. Newcastle University Chemical Engineering Department is working with industrial partners (such as Zeneca and BP) in this area.
in monitoring:
networks have been used to monitor
· the state of aircraft engines. By monitoring vibration levels and sound, early warning of engine problems can be given.
· British Rail have also been testing a similar application monitoring diesel engines.

in marketing:
networks have been used to improve marketing mailshots. One technique is to run a test mailshot, and look at the pattern of returns from this. The idea is to find a predictive mapping from the data known about the clients to how they have responded. This mapping is then used to direct further mailshots.

Artificial Neural Networks - Their Use and Their Application

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques.
Other advantages include:

Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.
Self-Organisation: An ANN can create its own organisation or representation of the information it receives during learning time.
Real Time Operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to the corresponding degradation of performance.
Neural networks process information in a way similar to the human brain. The network is composed of a large number of highly interconnected processing elements (neurones) working in parallel to solve a specific problem.

An artificial neuron is a device with many inputs and one output. The neuron has two modes of operation; the training mode and the using mode. In the training mode, the neuron can be trained to fire (or not), for particular input patterns. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong in the taught list of input patterns, the firing rule is used to determine whether to fire or not.

Applications of neural networks:

*sales forecasting

*industrial process control

*customer research

*data validation

*risk management

*target marketing

The computing world has a lot to gain from neural networks. Their ability to learn by example makes them very flexible and powerful. Furthermore, there is no need to devise an algorithm in order to perform a specific task, i.e. there is no need to understand the internal mechanisms of that task. They are also very well suited for real-time systems because of their fast response and computational times, which are due to their parallel architecture.

Neural networks also contribute to other areas of research such as neurology and psychology. They are regularly used to model parts of living organisms and to investigate the internal mechanisms of the brain.

It was often assumed in the early years of neural network research that implementation in special hardware would be required to take advantage of their capabilities. Such hardware, in particular, would probably be analog and involve multiple parallel processing elements and connections between them. However, the tremendous growth in the digital computing power of conventional von Neumann machines has allowed neural network (NNW) simulations in software to achieve great success in a number of applications. Meanwhile, the development of hardware especially designed for NNWs has been slow and with only modest commercial success. This overview looks at some possible reasons for this slow development and some of the areas where hardware NNWs in fact have been very useful and where future growth will occur.

NNW Applications in General

NNWs, despite all appearances to the contrary, are appearing in ever increasing numbers of real-world applications and are making real money:
OCR (Optical Character Recognition)
· Caere Inc ($3M profit on $55M revenue in 1997) "OmniPage Pro 6.0 significantly increases accuracy with its exclusive Quadratic Neural Network(TM) (QNN) technology, an enhancement to its industry-leading OCR engine..."
Data Mining
· HNC ($23M profit on $110M revenue in 1997). Their flagship product is Falcon. "Falcon is a neural network-based system that examines transaction, cardholder, and merchant data to detect a wide range of credit card fraud...".

These days a purchase of a new scanner typically includes a commercial OCR program. The algorithms used are proprietary but most OCR programs are believed to use NNWs. (Calera, started in 1986, did not admit to using NNW in its OCR programs until 1992 when Caere began advertising the use of them in its OCR products). Designers of OCR programs may choose NNWs to accomplish one or more of the processing steps while using other techniques for the remaining steps, such as conventional AI (If-Then rules), statistical models, hidden Markov models, etc. The point is that NNWs are becoming commonly used tools but, just like other math techniques such as FFT and least squares fit, they are still only tools, not the whole solution. Few real problems of interest can be totally solved by a single NNW.

Introduction to ANN

An Artificial Neural Network is a network of many very simple processors ("units"), each possibly having a (small amount of) local memory. The units are connected by unidirectional communication channels ("connections"), which carry numeric (as opposed to symbolic) data. The units operate only on their local data and on the inputs they receive via the connections. The design motivation is what distinguishes neural networks from other mathematical techniques: a neural network is a processing device, either an algorithm or actual hardware, whose design was motivated by the design and functioning of human brains and components thereof.
There are many different types of Neural Networks, each of which has different strengths particular to their applications. The abilities of different networks can be related to their structure, dynamics and learning methods.
Neural Networks offer improved performance over conventional technologies in areas which include: Machine Vision, Robust Pattern Detection, Signal Filtering, Virtual Reality, Data Segmentation, Data Compression, Data Mining, Text Mining, Artificial Life, Adaptive Control, Optimisation and Scheduling, Complex Mapping and more.

Refer to URL:

Artificial Neural Network

Introduction to ANN


An Artificial Neural Network (ANN) is an abstract simulation of a real nervous system that contains a collection of neuron units communicating with each other via axon connections. Such a model bears a strong resemblance to axons and dendrites in a nervous system.
The first fundamental modeling of neural nets was proposed in 1943 by McCulloch and Pitts in terms of a computational model of "nervous activity". The McCulloch-Pitts neuron is a binary device and each neuron has a fixed threshold logic. This model led to the work of John von Neumann, Marvin Minsky, Frank Rosenblatt, and many others.
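
A binary threshold unit of this kind can be sketched in a few lines. The weights and thresholds below are one common way to realise AND and OR with such a unit and are chosen for illustration; the function names are hypothetical.

```python
def mcp_neuron(inputs, weights, threshold):
    """Binary threshold unit: output 1 if the weighted sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def logical_and(a, b):
    """Fires only when both inputs are on (threshold 2)."""
    return mcp_neuron([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    """Fires when at least one input is on (threshold 1)."""
    return mcp_neuron([a, b], [1, 1], threshold=1)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, logical_and(a, b), logical_or(a, b))
```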

Hebb postulated, in his classical book The Organization of Behavior, that the neurons were appropriately interconnected by self-organization and that "an existing pathway strengthens the connections between the neurons". He proposed that the connectivity of the brain is continually changing as an organism learns different functional tasks, and that cell assemblies are created by such changes. By embedding a vast number of simple neurons in an interactive nervous system, it is possible to provide computational power for very sophisticated information processing. The neural model can be divided into two categories:

The first is the biological type. It encompasses networks mimicking biological neural systems such as audio functions or early vision functions.
The other type is application-driven. It depends less on faithfulness to neurobiology. For these models the architectures are largely dictated by the application needs. Many such neural networks are represented by the so-called connectionist models.
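
Hebb's postulate, mentioned earlier in this section, is often formalised as a weight change proportional to the product of input and output activity. That formula is a common textbook reading rather than something stated in the text, and the function name, learning rate, and pattern below are assumptions.

```python
def hebbian_update(weights, inputs, output, rate=0.1):
    """Strengthen each connection in proportion to joint input/output activity."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


weights = [0.0, 0.0, 0.0]
pattern = [1, 0, 1]
for _ in range(5):
    # Assume the unit is active (output = 1) whenever this pattern is presented.
    weights = hebbian_update(weights, pattern, output=1)
print(weights)
```

Connections from inputs that are repeatedly active together with the unit grow stronger, while the connection from the silent input stays at zero.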

ANN-How They Learn

Artificial neural networks typically start out with randomized weights for all their neurons. This means that they don't "know" anything and must be trained to solve the particular problem for which they are intended. Broadly speaking, there are two methods for training an ANN, depending on the problem it must solve.

A self-organizing ANN (often called a Kohonen network after its inventor) is exposed to large amounts of data and tends to discover patterns and relationships in that data. Researchers often use this type to analyze experimental data.

A back-propagation ANN, conversely, is trained by humans to perform specific tasks. During the training period, the teacher evaluates whether the ANN's output is correct. If it's correct, the neural weightings that produced that output are reinforced; if the output is incorrect, those weightings responsible are diminished. This type is most often used for cognitive research and for problem-solving applications.

Artificial Neural Networks

Introduction to neural networks

Like nanotechnology, neural networking is the use of technology to design and manufacture (intelligent) machines, built for specific purposes and programmed to perform specific tasks. However, unlike nanomachines, neural networks are designed to work like a nerve cell system, more similar to the workings of the human or biological brain in its physical form.

With today's complex society there is a growing need for semi-autonomous systems that can do some of the thinking and controlling for us. The logic of a neural network approximates our own thinking structures the closest and gives us the opportunity to endow specific intelligence to designed control systems.

Neural Network applications

What exactly are neural networks used for? Artificial neural networks are powerful tools for use in classification, empirical modeling and pattern recognition, for example. They are useful in fields as diverse as finance and investing, business, medicine, sports, science and manufacturing.

They are used to "predict" the rise and fall of stock prices, the outcomes of horse and dog races, hospital length of stay, the weather and earthquakes, and are also applied to plastics and concrete testing and to gene recognition.

In the field of robotics and artificial intelligence, artificial neural networks are crucial to the development of the robotic brain, its logic, its ability to learn, its processing and analyses of input.

Neural network software and programming

In view of the complexity in designing neural networks, it is not surprising that computers play a major role. No computer is useful without software, and applications made for working with neural networks, covering design, logic and implementation, are becoming more plentiful and mainstream. However, this is a growth industry, and as such there is always room for writing your own.

Neural network hardware

On the hardware front of neural network systems, great strides have been made. Mimicking or simulating a neural network can be done in different ways. The biological approach requires growing actual biological nerve cells and conditioning or programming them into specific behavior.

Introduction to ANN

Artificial neural networks are computers whose architecture is modeled after the brain. They typically consist of many hundreds of simple processing units which are wired together in a complex communication network. Each unit or node is a simplified model of a real neuron which fires (sends off a new signal) if it receives a sufficiently strong input signal from the other nodes to which it is connected. The strength of these connections may be varied in order for the network to perform different tasks corresponding to different patterns of node firing activity.
Neural networks are very different from conventional computers: they are composed of many rather feeble processing units which are connected into a network. Their computational power depends on working together on any task; this is sometimes termed parallel processing. There is no central CPU following a logical sequence of rules; indeed, there is no set of rules or program. Computation is related to a dynamic process of node firings. This structure is much closer to the physical workings of the brain and leads to a new type of computer that is rather good at a range of complex tasks.

Dear Students
I extend a warm welcome to all of you to join me and explore one of the most interesting, challenging and highly explored research areas of Computer Science: Artificial Neural Networks. This area provides solutions to varied problems in a very efficient manner. So let's get together to know what this area is all about and what is in store for us. All the very best for this semester.
