Dynamic routing via reinforcement learning for network traffic optimization