# SMHS DecisionTheory

## Scientific Methods for Health Sciences - Decision Theory

### Overview

Decision theory is concerned with determining the optimal course of action when several alternatives are available and their consequences cannot be forecast with certainty. In other words, decision theory is a method for making decisions on the basis of statistical knowledge when some uncertainty is involved. In this section, we present an introduction to decision theory and illustrate its application with specific examples. Sample R code is also provided to help apply decision theory in practice.

### Motivation

Suppose a drug company is deciding whether it should market a new drug. Two of the main factors to consider are the proportion of people for whom the drug will prove effective ($\theta_{1}$) and the proportion of the market the drug will capture ($\theta_{2}$). Both factors are generally unknown, even when experiments are conducted to obtain statistical information about them. This is a typical application of decision theory: the ultimate purpose is to decide whether to market the drug, how much of it to market, and similar questions. So, what is decision theory and how does it work?

### Theory

• Decision theory: concerned with the problem of making decisions in the presence of statistical knowledge which sheds light on some of the uncertainty involved in the decision problem. In most cases, we will assume that these uncertainties can be considered to be unknown numerical quantities, and will represent them by $\theta$, which could be a vector or matrix.
• Classical statistics is directed towards the use of sample information in making inferences about $\theta$, without regard to the use to which those inferences are to be put. In decision theory, by contrast, we try to combine the sample information with other relevant aspects of the problem in order to make the optimal decision. Two other types of information are typically relevant. The first is knowledge of the possible consequences of the decisions, quantified by determining the loss that would be incurred for each possible decision and for the various possible values of $\theta$. The second is non-sample information, called prior information, which is information about $\theta$ arising from sources other than the statistical investigation. Generally speaking, prior information comes from past experience with similar situations involving similar $\theta$. Decisions are commonly called actions; a particular action is denoted $a$, and the set of all possible actions under consideration is denoted $\ell$.
• The uncertain quantity $\theta$ that affects the decision process is commonly referred to as the state of nature. It is clearly important to consider what the possible states of nature are when making decisions. We use the symbol $\Theta$ to denote the set of all possible states of nature (the parameter space); $\theta$ itself is then the parameter. The loss function is an important element of decision theory. If a particular action $a_{1}$ is taken and $\theta_{1}$ turns out to be the true state of nature, then a loss $L(\theta_{1},a_{1})$ is incurred. The loss function $L(\theta,a)$ is thus defined for all $(\theta,a)\in\Theta\times\ell$. For technical convenience, only loss functions satisfying $L(\theta,a)\geq -K>-\infty$ will be considered.
• In a statistical investigation, the outcome is denoted $X$. It is often a vector $X=(X_{1},X_{2},\ldots,X_{n})$, where the $X_{i}$ are independent observations from a common distribution. A particular realization of $X$ is denoted $x$, and the set of possible outcomes is the sample space, denoted $\mathcal{L}$, usually a subset of $\mathbb{R}^{n}$, $n$-dimensional Euclidean space. The probability distribution of $X$ depends on the unknown state of nature $\theta$. Let $P_{\theta}(A)$ or $P_{\theta}(X\in A)$ denote the probability of the event $A$ ($A\subset\mathcal{L}$) when $\theta$ is the true state of nature. For simplicity, $X$ is assumed to be either a continuous or a discrete random variable with density $f(x|\theta)$. If $X$ is continuous, then $P_{\theta}(A)=\int_{A}f(x|\theta)\,dx$; if $X$ is discrete, then $P_{\theta}(A)=\sum_{x\in A}f(x|\theta)$.
• Example: In the drug example above, assume it is desired to estimate $\theta_{2}$. Since $\theta_{2}$ is a proportion, $\Theta=\{\theta_{2}:0\leq\theta_{2}\leq 1\}=[0,1]$. The action to be taken is the choice of a number as the estimate of $\theta_{2}$, hence $\ell=[0,1]$. The company might determine the loss function to be

$$L(\theta_{2},a)=\begin{cases}\theta_{2}-a, & \text{if } \theta_{2}-a\geq 0,\\ 2(a-\theta_{2}), & \text{if } \theta_{2}-a\leq 0.\end{cases}$$

The loss is in units of utility. Note that an overestimate of demand is considered twice as costly as an underestimate of demand, and that otherwise the loss is linear in the error. We could perform a sample survey to obtain information about $\theta_{2}$. For example, assume $n$ people are interviewed and the number $X$ who would buy the drug is observed. It might be reasonable to assume that $X$ is $B(n,\theta_{2})$, which has density function $f(x|\theta_{2})=\binom{n}{x}\theta_{2}^{x}(1-\theta_{2})^{n-x}$. There could well be considerable prior information about $\theta_{2}$ arising from previous introductions of similar new drugs into the market. Say that, in the past, new drugs tended to capture between 1/10 and 1/5 of the market, with all values between 1/10 and 1/5 being equally likely. This prior information could be modeled by giving $\theta_{2}$ a $U(0.1,0.2)$ prior density, i.e., letting $\pi(\theta_{2})=10\,I_{(0.1,0.2)}(\theta_{2})$. The above development of $L$, $f$ and $\pi$ is quite crude, and usually much more detailed constructions are required to obtain satisfactory results. The techniques for doing this will be developed as we proceed.
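The three ingredients of the drug example (the loss $L$, the sampling density $f$ and the prior $\pi$) can be written down directly as functions. The following is a minimal Python sketch, not part of the original material. The function names, the survey size `n = 20` and the value `theta2 = 0.15` are hypothetical choices made only for illustration. The sketch evaluates the ingredients and the discrete-case probability $P_{\theta}(A)$ as a sum of $f(x|\theta_{2})$ over $x\in A$. It does not attempt to derive an optimal decision.

```python
from math import comb


def loss(theta2, a):
    """Loss of estimating theta2 by a; an overestimate costs twice as much."""
    if theta2 - a >= 0:
        return theta2 - a        # underestimate of demand: linear loss
    return 2 * (a - theta2)      # overestimate of demand: doubled linear loss


def binom_density(x, n, theta2):
    """f(x | theta2) for X ~ B(n, theta2)."""
    return comb(n, x) * theta2**x * (1 - theta2) ** (n - x)


def prior_density(theta2):
    """pi(theta2) = 10 * I_(0.1, 0.2)(theta2), i.e. the U(0.1, 0.2) density."""
    return 10.0 if 0.1 < theta2 < 0.2 else 0.0


def prob_event(event, n, theta2):
    """P_theta(A) for discrete X: the sum of f(x | theta2) over x in A."""
    return sum(binom_density(x, n, theta2) for x in event)


if __name__ == "__main__":
    n = 20          # hypothetical number of people interviewed
    theta2 = 0.15   # hypothetical true market share

    # Same absolute error of 0.05 in both directions, different losses.
    print("loss when underestimating:", loss(theta2, 0.10))
    print("loss when overestimating: ", loss(theta2, 0.20))

    # Probability that at most 3 of the n respondents would buy the drug.
    print("P(X <= 3):", prob_event(range(0, 4), n, theta2))

    # The prior is positive only inside (0.1, 0.2).
    print("prior at 0.15 and 0.30:", prior_density(0.15), prior_density(0.30))
```

With these definitions, an underestimate of 0.05 incurs a loss of about 0.05, while an overestimate of the same size incurs about 0.10, which is the asymmetry described in the example.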

• The main point to remember is that a well-defined loss function and explicit prior information are needed in decision theory. Many statisticians use statistical inference as a shield to ward off consideration of losses and prior information, which is a mistake for the following reasons: (1) reports from statistical inference should be constructed so that they can be easily utilized in individual decision-making; (2) the investigator may very well possess information about losses and the prior; (3) the choice of an inference (beyond mere data summarization) can itself be viewed as a decision problem, where the action space is the set of all possible inference statements and the loss function reflects the success in conveying knowledge.