
This research presents an agent-based approach to finding near-optimal solutions to the newsvendor problem with price-dependent demand. In the classical newsvendor problem, an order quantity must be chosen to maximize expected profit. Here, the additional consideration of price-sensitive demand is included. This means that price (*P*) and order quantity (*Q*) are both decision variables under the control of the newsvendor, with the intent of maximizing expected profit. The solution approach exploits an agent-based strategy in which an artificial agent traverses a grid of Price and Quantity values, where each unique Price and Quantity combination results in an expected profit. The agent-based approach consistently yields essentially optimal solutions to a problem from the literature.

The classical newsvendor (formerly known as “newsboy”) problem involves an individual selling items on a periodic basis (such as daily newspapers). The individual needs to determine how many items to purchase for resale in the face of uncertain demand in order to maximize the expected profit [

Another variable to consider is the price. A fundamental law of economics tells us that for commodity items, demand decreases as price increases. Conversely, demand increases as price decreases [

Because both decision variables must be set in the face of uncertainty, we need a solution approach that considers them simultaneously. This is the theme of this research effort. The approach presented here is agent-based, in the spirit of Ant Colony Optimization or similar [

The subsequent sections of this paper explain the methodology in terms of the profit function dependent on price and quantity, describe the development of the space representing price, quantity and expected profit, and describe the agent-based simulation model used to find the values of price and quantity that maximize the expected profit. Conclusions are then offered and opportunities for subsequent research are discussed.

The methodology used for this effort is essentially comprised of three parts: the first part details the newsvendor model with price-sensitive demand; the second part details the construction of the search space for the artificial agent; and the third part details the simulation model used to guide the artificial agent’s search for optimality.

The newsvendor problem, in the context of this research, strives to maximize the profit associated with the purchase and subsequent resale of some item. With that in mind, the following definitions are provided in

Term | Definition |
---|---|
P | unit selling price |
Q | units purchased |
C | unit purchase cost |
V | unit salvage value (P > C > V) |
S | unit shortage cost |
D | units demanded |
π | net profit for period |
a | intercept for price-sensitive demand |
b | slope for price-sensitive demand |

As stated previously, demand is stochastic and price-sensitive. As such, the following relationship more specifically describes demand:

D = a − bP + ε, (1)

where

ε ~ N(μ_{D}, σ_{D}) (2)

Combining revenue and cost functions in terms of the above, the following profit function results:

π = (P − V + S) · min(Q, D) − S · D − (C − V) · Q (3)
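As a check on Equation (3), the following minimal Python sketch computes profit two ways: once with the compact form above, and once by itemizing sales revenue, salvage revenue, shortage cost and purchase cost. The function names are illustrative and are not taken from the original study.

```python
def profit(P, Q, D, C, V, S):
    """Single-period profit using the compact form of Equation (3)."""
    sold = min(Q, D)
    return (P - V + S) * sold - S * D - (C - V) * Q


def profit_itemized(P, Q, D, C, V, S):
    """Same profit, itemized as revenue + salvage - shortage - purchase cost."""
    sold = min(Q, D)          # units actually sold
    leftover = max(Q - D, 0)  # unsold units, salvaged at V each
    short = max(D - Q, 0)     # unmet demand, penalized at S each
    return P * sold + V * leftover - S * short - C * Q


if __name__ == "__main__":
    # Illustrative inputs only: one overstock case and one shortage case.
    for demand in (80, 120):
        print(demand, profit(4.0, 100, demand, 1.0, 0.5, 1.0))
```

The two forms agree because the leftover units equal Q − min(Q, D) and the shortage equals D − min(Q, D); substituting these into the itemized form and collecting terms gives Equation (3).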

Ideally, one would take the partial derivatives of profit with respect to P and Q, set them equal to zero, and solve for P and Q. However, the non-smooth nature of the equation (the min term has a kink at Q = D), together with the fact that D is stochastic, makes this process unrealistic, resulting in conditional approaches [

In order to have an artificial agent search for a combination of price and quantity, a search space must be constructed. In order to do this, the following definitions are provided in

The number of subintervals, n, is the same for both price and quantity so that the search grid is square. The pseudocode to determine the grid values is as follows:

for i = 0 to n {

P = P_{min} + i × ((P_{max} − P_{min})/n)

for j = 0 to n {

Q = Q_{min} + j × ((Q_{max} − Q_{min})/n)

for k = 1 to m {

calculate π(P, Q)}

calculate π̄, s

}}

calculate ρ

Term | Definition |
---|---|
P_{min} | minimum price explored |
P_{max} | maximum price explored |
Q_{min} | minimum quantity explored |
Q_{max} | maximum quantity explored |
n | number of subintervals |
m | simulated replications |
i | price index |
j | quantity index |
k | replication index |
π̄ | average profit |
s | standard deviation of profit |
ρ | percentile rank of profit |

For each unique combination of P and Q, the value of π is computed m times. This is done to account for the stochastic nature of demand (D). As such, profit is simulated for each combination of P and Q, resulting in a value of π̄ for each combination.

The pseudocode above provides (n + 1)^{2} grid points. Each grid point has values of P, Q, π̄, and s. In addition, the percentile rank (ρ) for profit is calculated for each of the (n + 1)^{2} grid points.
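The grid construction can be sketched in Python as follows. This is a minimal illustration and not the Java program used in the study. It steps P by (P_{max} − P_{min})/n and Q by (Q_{max} − Q_{min})/n, and it rests on several assumptions: the second parameter of the normal distribution is treated as a standard deviation, simulated demand is not truncated at zero, and the percentile rank ρ is taken as rank divided by the number of grid points. The function name `simulate_grid` and the dictionary keys are hypothetical.

```python
import random
import statistics


def simulate_grid(p_min, p_max, q_min, q_max, n, m, a, b, mu, sigma, C, V, S, seed=0):
    """Build the (n + 1) x (n + 1) grid of simulated profit statistics."""
    rng = random.Random(seed)
    cells = []
    for i in range(n + 1):
        P = p_min + i * (p_max - p_min) / n
        for j in range(n + 1):
            Q = q_min + j * (q_max - q_min) / n
            profits = []
            for _ in range(m):
                # Equations (1)-(2): price-sensitive demand plus normal noise
                # (assumption: sigma is a standard deviation, no truncation).
                D = a - b * P + rng.gauss(mu, sigma)
                # Equation (3): profit for this replication.
                profits.append((P - V + S) * min(Q, D) - S * D - (C - V) * Q)
            cells.append({
                "i": i, "j": j, "P": P, "Q": Q,
                "mean": statistics.fmean(profits),  # average profit
                "sd": statistics.stdev(profits),    # standard deviation of profit
            })
    # Assumed percentile rank: position in ascending order of mean profit,
    # divided by the number of grid points, so the best cell has rho = 1.
    order = sorted(range(len(cells)), key=lambda k: cells[k]["mean"])
    for rank, k in enumerate(order, start=1):
        cells[k]["rho"] = rank / len(cells)
    return cells


if __name__ == "__main__":
    grid = simulate_grid(2.75, 5.75, 70, 370, 4, 20, 200, 35, 60, 20, 1.0, 0.5, 1.0)
    best = max(grid, key=lambda c: c["mean"])
    print(len(grid), best["P"], best["Q"])
```

The small n and m in the usage lines are chosen only to keep the run short; they are not the settings used in the experiment.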

The end-result is a grid that takes on the following general structure:

At this point there is an (n + 1) by (n + 1) search grid for the artificial agent to traverse. To repeat, each grid point has a value of P, Q, π̄, s and ρ. The intent is to create an artificial agent to find the grid point where π̄ is maximized. The definitions in

The following subsections describe the search associated with the artificial agent.

Prior to the commencement of the search, several terms are initialized. The values of P, Q, π̄, s and ρ, as described in the previous section, are used as values for their respective grid points. The visitation index (𝓋) for each grid point is initialized to 1. The agent index (𝒶) is initialized to 1 [

An artificial agent is created and located at a random position on the Price-Quantity (P, Q) grid. The time index (t) is initialized to zero.

Term | Definition |
---|---|
S | set of neighbors of current agent |
α | grid point amplifier |
τ | grid point threshold |
p_{i} | probability of agent moving to neighbor i |
π_{B} | best profit |
π_{C} | current profit |
t | time index |
T | duration of agent’s life |
𝒶 | agent index |
A | number of agents used in simulation |
Terms Associated with Unique Grid Points | |
π̄ | average profit |
𝓋 | visitation index |
ρ | percentile value |

The agent moves to a neighboring grid point.

In this figure, several artificial agents are shown in black cells and denoted by letters. Each agent has neighboring grid points, which are the points adjacent to the agent, and these cells are grey. The agents in the corners of the grid (I, K, A and D) each have three neighbors. The agents on the edges of the grid (J, E, H and B) each have five neighbors. The agents in the middle of the grid (L, C, F, M, G) each have eight neighbors. Regardless of where the agent is located, the grid points that neighbor the current agent form the set S.
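The neighbor counts described here (three, five or eight) can be reproduced with a short Python sketch. It assumes grid points are indexed by a price index i and a quantity index j, each running from 0 to n; the function name `neighbors` is illustrative.

```python
def neighbors(i, j, n):
    """Return the adjacent grid points of (i, j) on an (n + 1) x (n + 1) grid."""
    result = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # the agent's own cell is not a neighbor
            ni, nj = i + di, j + dj
            # Keep only candidates that fall inside the grid.
            if 0 <= ni <= n and 0 <= nj <= n:
                result.append((ni, nj))
    return result


if __name__ == "__main__":
    n = 300
    print(len(neighbors(0, 0, n)), len(neighbors(0, 5, n)), len(neighbors(5, 5, n)))
```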

The current agent will move to a neighboring grid point. Monte-Carlo simulation is used to select the neighbor. Neighbor i of the current agent has the following probability of being selected:

This probability has two components. The first component is the relative desirability of the expected profit of neighbor i. The second component relates to the relative desirability of neighbor i itself. Think of this relative desirability of neighbor i as a surrogate measure of pheromone, a tool used to encourage or discourage visits to particular grid points.
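A Monte-Carlo selection of this kind can be sketched in Python as below. The exact form of the selection probability is not reproduced in this text, so the sketch assumes one plausible form: the weight of neighbor i is the product of its percentile rank ρ and its visitation index 𝓋, normalized over the set S. That product form, and the function names, are assumptions made for illustration only.

```python
import random


def move_probabilities(rho, visits):
    """Assumed form: p_i proportional to rho_i * v_i over the neighbor set."""
    weights = [r * v for r, v in zip(rho, visits)]
    total = sum(weights)  # positive as long as every rho and v is positive
    return [w / total for w in weights]


def choose_neighbor(rho, visits, rng):
    """Pick the index of one neighbor by roulette-wheel (Monte-Carlo) selection."""
    probs = move_probabilities(rho, visits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    # Three neighbors with equal visitation indices but different profit ranks.
    print(move_probabilities([0.2, 0.6, 0.2], [1.0, 1.0, 1.0]))
    print(choose_neighbor([0.2, 0.6, 0.2], [1.0, 1.0, 1.0], rng))
```

Under this assumed form, a neighbor with a higher profit rank or a higher visitation index is more likely to be chosen, but every neighbor retains a nonzero chance, which keeps the agent's movement random to some degree.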

The agent moves to the chosen neighboring grid point. The values of P, Q, π̄, s and ρ for the new grid point are already known. The profit associated with the new grid point is π_{C}. If this value of π_{C} is the best profit thus far, the value of π_{B} is replaced with the value of π_{C}. If the percentile rank value of the new grid point (ρ) exceeds a threshold value (τ), the following adjustment occurs:

𝓋 = (1 + α) (5)

Otherwise, this adjustment occurs:

𝓋 = (1 − α) (6)

This adjustment of the visitation (or pheromone) index is done to enhance desirable grid points, and to reduce less desirable grid points.
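The threshold test and the visitation adjustment can be sketched in Python as follows. Equations (5) and (6) as printed assign (1 + α) or (1 − α) to the visitation index. By default the sketch instead scales the existing index by that factor, so that repeated visits accumulate; this cumulative reading is an assumption, and passing `cumulative=False` gives the literal reading. The function name is illustrative.

```python
def update_visitation(v, rho, tau, alpha, cumulative=True):
    """Adjust a grid point's visitation (pheromone) index after a visit."""
    # Equation (5) applies when rho exceeds the threshold, Equation (6) otherwise.
    factor = 1 + alpha if rho > tau else 1 - alpha
    # Assumption: the cumulative reading scales the current index;
    # the literal reading replaces it with the factor.
    return v * factor if cumulative else factor


if __name__ == "__main__":
    v = 1.0
    for rho in (0.90, 0.95, 0.40):  # illustrative percentile ranks
        v = update_visitation(v, rho, tau=0.72, alpha=0.009)
    print(v)
```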

If the value of 𝒶 is equal to A, the simulation is over, and all relevant outputs are reported. Otherwise, 𝒶 is incremented by 1 (𝒶 = 𝒶 + 1). If the value of t is equal to T, the current agent is terminated, and control is returned to subsection 2.3.2. Otherwise, t is incremented by 1 (t = t + 1) and control is returned to subsection 2.3.3.

If the value of t is equal to the value of T, then the agent terminates, and control is returned to subsection 2.3.2.

Upon the end of the simulation, several values are noted: the overall best profit found (π_{B}), the Price and Quantity values associated with this profit (P and Q), and the percentage of agents that exceed the threshold value (τ) during the simulation.

The methodology presented above was carried out via experimentation. A problem from the literature was used as an example [

Term | Value |
---|---|
P | [$2.75, $5.75] in increments of $0.01 |
Q | [70, 370] in increments of 1 |
C | $1.00 |
V | $0.50 |
S | $1.00 |
D | ~N(60, 20) |
a | 200 |
b | 35 |
m | 250 |

The values above were substituted into Equation (3), with P and Q as variables. There are 301 values for both P and Q. For each P, Q combination, profit was simulated m = 250 times to obtain reliable estimates of both π̄ and s. This results in a grid of 301 × 301 = 90,601 grid points. After the grid points were generated via simulation, the percentile values in terms of π̄ were determined and recorded as ρ. The grid data set was generated using the Java programming language.

In terms of using artificial agents to find the values of P and Q that maximize π̄, an environment was constructed using NetLogo, a software package that specializes in agent-based modeling [

The coloration of the grid emulates a heat map, intended to show relative values of expected profit (π̄). In this context, the heat map represents three-dimensional data [

The search parameters listed below show that each agent had a lifetime of 2000 time units, and each simulation used 50 agents. Experimentation showed the most favorable threshold value (τ) to be 0.72 and the most favorable amplifier value (α) to be 0.009.

Term | Value |
---|---|
α | 0.009 |
τ | 0.72 |
T | 2000 |
A | 50 |

The NetLogo simulation was performed 200 times at these settings, and the average profit was found to be $359.17, with the optimal profit being $359.29. A formal hypothesis test was performed to see if the average profit found was statistically different from the optimal of $359.29. The test is as follows:

H_{0}: μ = 359.29; H_{A}: μ < 359.29 (7)

The above provided a test statistic of t = −0.29, with an associated p-value of 0.3842. As such, we cannot reject H_{0}; the average profit found by the agents is not statistically distinguishable from the optimal.
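The lower-tail p-value can be approximated in Python from the reported test statistic. The sketch below uses the standard normal distribution in place of the t distribution, which is a simplifying assumption that is reasonable for 200 simulation runs. Because the reported statistic is rounded to two decimals, the result should land close to, but not exactly on, the reported p-value. The function name is illustrative.

```python
from statistics import NormalDist


def lower_tail_p(test_statistic):
    """Approximate P(T <= t) with the standard normal CDF (large-sample assumption)."""
    return NormalDist().cdf(test_statistic)


if __name__ == "__main__":
    p = lower_tail_p(-0.29)
    # H0 is not rejected at the 0.05 level when p exceeds 0.05.
    print(round(p, 4), p > 0.05)
```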

Another performance-related metric that was studied was the percentage of agents that met the threshold (τ) during the simulation. During the simulation, there is no guarantee an agent will find the optimal solution. In fact, there is no guarantee that an agent will even find a good solution. This is because the agent is randomly assigned a starting point, and the agent’s movements are always random to some degree. This is the very reason that multiple agents are used in a simulation. For this effort, it was found that 92.46% of the agents simulated met the threshold of τ, which was the 72^{nd} percentile, with a standard deviation of 3.53%. It turns out that finding the optimal solution was easier than achieving a high percentage of agents that met the threshold value of τ. The percentage of agents meeting the threshold of τ is considered a measure of robustness for the approach. The values of τ and α essentially control the amount of pheromone associated with each of the grid points. The combination of τ = 0.72 and α = 0.009 was the result of rigorous experimentation.

A methodology has been presented to address the newsvendor problem with price-sensitive and stochastic demand. An agent-based approach is employed to provide a novel search approach to the problem with essentially optimal results.

One of the unique things associated with the test problem is that we (as observers) already know the optimal solution before we pursue it via the artificial agent search. Because the artificial agent is “blind,” we find this approach reasonable. The blindness of the artificial agent exists because the agent does not know where the optimal solution is; only the observer knows this. The agent only knows about properties of neighboring grid points (values of P and Q), and behaves accordingly. The challenge of the agent-based approach is to instruct the agent to find the optimal solution via the most efficient means possible. For larger, perhaps more “real-world” types of problems, the observer is much less likely to know the optimal solution. This is a common trait for problems of the “hill-climbing” variety.

This effort presents many opportunities for subsequent research. The normal distribution was used here to model the stochastic nature of demand (D). There are, of course, other distributions that could be used to model demand (exponential, etc.). There are other types of agent-based approaches that could be used to find optimal and/or desirable solutions. Specifically, other mechanisms to emulate pheromone could be explored. Additionally, other, particularly larger, problems could be addressed. Also, our objective was expressed in terms of expected profit (π̄), which means that there is a degree of uncertainty associated with it. A different objective function, considering the degree of variation associated with the expected value, could be considered an opportunity for subsequent research.

In summary, the conclusions can be condensed as follows:

- An agent-based approach is used to address the price-sensitive newsvendor problem with essentially optimal results and in a computationally efficient manner.
- There are opportunities for subsequent research:
  - Various probability distributions could be explored.
  - Different pheromone approaches could be studied.
  - Larger, more complex problems could be addressed.
  - Other objective functions could be pursued.

The author declares no conflicts of interest regarding the publication of this paper.

McMullen, P.R. (2020) An Agent-Based Approach to the Newsvendor Problem with Price-Dependent Demand. American Journal of Operations Research, 10, 101-110. https://doi.org/10.4236/ajor.2020.104006