A type of machine learning in which an “agent” learns which actions to take by trial and error, guided by feedback in the form of rewards. In trading, RL agents (often deep Q-networks, i.e. neural networks that approximate Q-values) interact with a market simulation: they choose trades, observe the resulting gains or losses (the rewards), and iteratively learn a policy that maximizes cumulative profit. As one study notes, RL “has been widely applied to develop investment and trading strategies in financial markets” because it can learn trading rules directly, without an explicit forecasting step. A reinforcement-learning cBot might continually adapt its strategy to market changes, but training it requires careful design (e.g. shaping rewards to manage risk).
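To make the loop of "choose a trade, observe the reward, update the policy" concrete, here is a minimal sketch using tabular Q-learning rather than a deep Q-network. Everything in it is an illustrative assumption and not taken from any study or trading platform: the toy market (`simulate_move`, where the next price move repeats the last one with a fixed probability), the three-action set, the `risk_penalty` used as a crude form of reward shaping, and all hyperparameter values.

```python
import random

ACTIONS = (-1, 0, 1)  # short, flat, long (hypothetical action set)


def simulate_move(last_move, rng, persistence=0.8):
    """Toy market (an assumption): the next move (+1 or -1) repeats the
    last move with probability `persistence`, otherwise it reverses."""
    return last_move if rng.random() < persistence else -last_move


def train_q_table(steps=100_000, alpha=0.01, gamma=0.5, epsilon=0.1,
                  risk_penalty=0.05, seed=0):
    """Tabular Q-learning on the toy market. Returns Q[state][action]."""
    rng = random.Random(seed)
    # The state is just the sign of the last price move (-1 or +1).
    q = {s: {a: 0.0 for a in ACTIONS} for s in (-1, 1)}
    state = 1
    for _ in range(steps):
        # Epsilon-greedy: explore occasionally, otherwise act greedily.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])
        move = simulate_move(state, rng)
        # Shaped reward: P&L of the position minus a small charge
        # for holding any exposure at all.
        reward = action * move - risk_penalty * abs(action)
        next_state = move
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + gamma * max(q[next_state].values())
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in q}


if __name__ == "__main__":
    print(greedy_policy(train_q_table()))
```

With these settings the agent should learn to go long after an up move and short after a down move, since the toy market is built to trend. Trades here neither move prices nor carry costs, so the discount factor has little effect on the learned policy; a realistic environment would add position state, transaction costs, and richer market features, which is where function approximation such as a deep Q-network comes in.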