# Uncertainty and Risk, Theory and Empirics: With Applications to Big Data in Finance

25 Pages. Posted: 6 Jul 2023. Last revised: 28 Aug 2023.

Date Written: August 28, 2023

### Abstract

Knowing the entire universe of possible outcomes, even without knowing which particular outcome has occurred, is a situation of known unknowns and is the domain of risk; not knowing the entire universe of possible outcomes is a situation of unknown unknowns and is the domain of uncertainty. This distinction is detailed in Part II. In Chapter 2, I provide a measure-theoretic generalization of standard probability theory under risk to a theory of generalized probability under uncertainty. I show that the generalized probability measure over its known universe is different from a reconciled probability measure conditioned upon the known universe, but that, given a random variable and a generalized random variable that are reconciled over a known universe, the expectation of the random variable conditioned upon the known universe equals the expectation of the generalized random variable. I provide a rigorous distinction between public and market information. In Section 2.7, Nripesh Podder and I propose a simple resolution of the St. Petersburg Paradox that works even under risk neutrality by suggesting a fundamental rethink of the expected utility framework through the identification of the uncertainty intrinsic to a situation.

“If you toss a coin seven times, and all tosses come up heads, what is the chance of the eighth toss also turning up heads?” In Chapter 3, I show that the answer depends on the level of sureness about the prior generalized probability that the coin is fair. I extend the analysis to the generalized probability of gender-neutrality of birth on the basis of the observed number of female and male births. I prove that updating the generalized probability on the basis of new information is scientific and, since the criterion of falsifiability is essential to a scientific theory, that faith consists of a zero-one prior whereas the scientific method requires a non-zero-one prior.
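The dependence of the coin-toss answer on prior sureness can be illustrated with a textbook Beta-Binomial update. This is a minimal sketch under standard probability, not the generalized probability framework of Chapter 3: the symmetric Beta(a, a) prior, the function name `prob_next_head`, and the use of `a` as a stand-in for "sureness" are all assumptions made for illustration.

```python
from fractions import Fraction

def prob_next_head(heads: int, tails: int, sureness: int) -> Fraction:
    """Posterior predictive P(next toss is heads).

    Assumes a symmetric Beta(sureness, sureness) prior on the heads
    probability (an illustrative choice, not the paper's construction).
    A larger `sureness` means a prior more tightly concentrated on 1/2.
    """
    a = Fraction(sureness)
    return (a + heads) / (2 * a + heads + tails)

if __name__ == "__main__":
    # Seven heads in seven tosses, under increasingly confident priors.
    for a in (1, 10, 1000):
        print(a, float(prob_next_head(7, 0, a)))
```

As `sureness` grows, the predictive probability stays close to 1/2 regardless of the seven observed heads; a weak prior moves substantially toward heads. In the limit of a degenerate (zero-one) prior, no amount of data changes the answer.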

In Chapter 4, I replace each variable, except for each indicator or time variable, by the standard cumulative Gaussian generalized probability of its Z-score, (variable minus mean) / standard deviation. This is a rigorization of the number-of-standard-deviations approach to the interpretation of coefficients, which also implicitly assumes Gaussian distributions. The replacement makes the impacts comparable, which allows a systematic and objective definition of economic significance, as distinct from statistical significance. Therefore, all regression coefficients are systematic, objective, and comparable. For structural systems, the full association of two variables is measured by the total derivative; my method makes full associations comparable as well. I systematically define a full or partial association to be actually significant positive (negative) at level l if the relevant derivative is greater than l (less than -l).
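The variable transformation described above can be sketched as follows. This is a hedged illustration only: the function name `gaussian_score` is hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator is used.

```python
from statistics import NormalDist, mean, stdev

def gaussian_score(values):
    """Map each value x to Phi((x - mean) / sd), where Phi is the
    standard Gaussian cumulative distribution function.

    Assumption: `sd` is the sample standard deviation. The input must
    contain at least two distinct values so that sd is positive.
    """
    m = mean(values)
    s = stdev(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.cdf((x - m) / s) for x in values]

if __name__ == "__main__":
    print(gaussian_score([1.0, 2.0, 3.0]))
```

Every transformed variable lies strictly between 0 and 1, so regression coefficients on transformed variables share a common scale, which is what makes a single significance level l meaningful across coefficients.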

Part III provides the first set of applications of the theory and empirics of this paper to independent, systematic, and objective Big Data work in finance, using daily data to study arbitrage risk, objective measures of market efficiency, and options. In Chapter 5, I provide a general methodology to measure the arbitrage risk, which is a negative proxy for the market efficiency, of a stock for any relevant period. I apply this methodology to calculate the arbitrage risk of each U.S. exchange-listed common stock for every calendar year from 1988 to 2010. I find that market efficiency is significantly affected by turnover (negatively), the number of market makers for Nasdaq stocks (negatively), and serial correlation in the Capital Asset Pricing Model of the stock (positively). The relations between market efficiency and market capitalization (positive), bid-ask spread (negative), and institutional ownership (positive) are consistent with conventional wisdom. The impacts on market efficiency of the number of securities analysts following a stock and of the public float ratio of a stock are of ambiguous significance.

In Chapter 6, for every U.S.-listed security for every year between 2001 and 2017, I run four different event studies to calculate four separate objective measures of the market efficiency for that security for that year. These studies provide an objective characterization of that security’s market for that year, to determine whether it is sufficiently efficient or not. I apply these methodologies to Petrobras’s American Depositary Receipt (ADR), traded as PBR on the New York Stock Exchange, from 2001 to 2017 and conclude that the Petrobras Court reached the incorrect conclusion when it certified a 2010 to 2015 class period because the market for PBR was sufficiently efficient in 2010, 2011, and 2014, but not sufficiently efficient in 2012, 2013, and 2015. I also apply these methodologies to the valuation of each U.S.-listed firm in 2001-2017. Three examples are as follows: a) the market for GS (Goldman Sachs Group, Inc., common equity) was sufficiently efficient in each year in 2001-2017, and consequently market prices represented value for Goldman Sachs over 2001-2017; b) the market for MSFT (Microsoft Corporation, common equity) was sufficiently efficient in 2001 and 2003-2017, and therefore market prices represented value for Microsoft Corporation in 2001-2017 except for 2002; and c) the market for AAME (Atlantic American Corporation, common equity) was not sufficiently efficient in any year in 2001-2017, and therefore market prices did not represent value for Atlantic American Corporation in 2001-2017.

In Chapter 7, I use five separate measures of deviation from Put-Call Parity of options on a stock without splits or dividends as separate negative measures for efficiency of the market ecosystem consisting of the underlying stock, derivatives, and risk-free securities. I use Three-Stage Least Squares (3SLS) to estimate this structural system, separately for Nasdaq and non-Nasdaq U.S. stocks, over 1996-2015. I find, contrary to much previous theoretical and empirical work, that the impact of short sales costs & constraints on market efficiency is not significantly negative and that the impact of trading volume on market efficiency is not significantly positive.
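For reference, one simple way to quantify a deviation from Put-Call Parity is sketched below. It uses the textbook European relation C - P = S - K·exp(-rT) for a stock paying no dividends. The abstract does not specify the five measures used in Chapter 7, so the function name `pcp_deviation`, the scaling by the spot price, and the continuous-compounding convention are all assumptions; exchange-traded U.S. equity options are typically American-style, for which parity holds only as a pair of bounds.

```python
import math

def pcp_deviation(call: float, put: float, spot: float,
                  strike: float, rate: float, years: float) -> float:
    """Absolute deviation from European put-call parity, scaled by spot.

    Parity for a non-dividend-paying stock: C - P = S - K * exp(-r * T).
    Returns |(C - P) - (S - K * exp(-r * T))| / S, an illustrative
    measure and not necessarily one of the paper's five.
    """
    parity_gap = (call - put) - (spot - strike * math.exp(-rate * years))
    return abs(parity_gap) / spot

if __name__ == "__main__":
    # Hypothetical quotes chosen for illustration only.
    print(pcp_deviation(call=11.0, put=5.0, spot=105.0,
                        strike=100.0, rate=0.0, years=1.0))
```

A larger value indicates a larger pricing gap across the stock, its options, and the risk-free asset, which is why such a deviation can serve as a negative measure of efficiency.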

Part IV provides the second set of applications of the theory and empirics of this paper to independent, systematic, and objective Big Data work in finance, using high-frequency intraday data for event studies and market efficiency work for equities and fixed income securities. In Chapter 8, using intraday data from TAQ, TRACE, I/B/E/S, and Capital IQ, using daily data from CRSP, Compustat, CRSP-Compustat Merged Database, and FRED, I find that, for all publicly traded U.S. stocks for 2014 - September 2021, abnormal reactions are systemically all out of the system within two hours after a potentially material event. I compile a dataset of systematic, independent, and objective characterizations of each ticker-year, ticker-halfyear, ticker-quarter, and ticker-month, as statistically and economically significant efficient, statistically and economically significant inefficient, or otherwise. I find that capital markets during the first two months of the COVID-19 lockdown were statistically and economically significantly less efficient.

In Chapter 9, I develop six systematic and ordinal direct market efficiency measures based on controlled contrasts between absolute abnormal returns in relevant halfhours versus absolute abnormal returns in control halfhours. Applying an eight-equation structural model with market efficiency as a function of exogenous factors and endogenous market activities, and each endogenous market activity as a function of exogenous factors and all other endogenous activities, using intraday data on all U.S. public companies over 2014 - September 2021, I find that market efficiency did not improve with time.

Chapter 10 studies the impact of time and characteristics of the issuing firm on market efficiency of fixed income (FI) securities. I use two different metrics, depending on how a potential material event is determined, with three different announcement windows for each, based on event studies, with intraday fixed income and equity data on all publicly traded U.S. companies over 2014 - September 2021, as separate objective and systematic measures of the efficiency of the market for an FI security. I use an eight-equation structural model with market efficiency as a function of exogenous factors and endogenous market activities, and each endogenous market activity as a function of the exogenous factors and all other endogenous market activities, and I apply Three-Stage Least Squares and Errors in Variables to estimate the structural system, using panel-based instrumentation strategies. I find that market efficiency of FI securities does not improve with time, that investor valuation dispersion and short sales costs & constraints of the equity of the issuing firm have significant negative associations with market efficiency of FI securities, and that transaction costs & constraints of the equity of the issuing firm have an ambiguous association with market efficiency of FI securities.

Part VI concludes and provides suggestions for future work.

**Keywords:** Uncertainty; Risk; Generalized Probability; Event Studies; Market Efficiency; Big Data

**JEL Classification:** C13; C14; C18; G14; G12; C58; C33; C36
