MD AMIR KHAN
Agentic AI Engineer & Quantitative Analyst
AI Engineer and Quantitative Analyst at DXT Commodities building intelligent systems for LNG and natural gas markets. I design and deploy LLM-powered pipelines, automated market intelligence tools, and production ML models — combined with deep quantitative work in statistical modeling, energy market forecasting, and portfolio research. I turn raw market data into real-time trading decisions.
About Me
I am an AI Engineer and Quantitative Analyst at DXT Commodities (Stamford, CT), where I build production-grade AI systems and quantitative models for LNG and natural gas markets. On the AI engineering side, I architect LLM-powered notice parsing pipelines, automated web scraping and alerting systems, FastAPI backends, and full-stack market intelligence tools deployed on AWS EC2. On the quant side, I build statistical production models (OLS/Ridge regression), energy market forecasting frameworks, and supply-demand balance systems that feed directly into trading desk decisions.
My AI engineering work at DXT includes an LLM-based pipeline that parses unstructured pipeline maintenance notices for capacity impact extraction, a real-time EPNG scraper pushing Force Majeure alerts to Microsoft Teams within minutes of posting, and a full-stack Pipeline Maintenance Calendar covering 23 interstate operators across 6 portal systems with SQL Server storage and Power BI dashboards. My quantitative work includes a US LNG feed gas forecasting system covering 126 export trains achieving <3% MAPE on an 8-month holdout, and a Permian Basin production estimation model bridging the 2-month EIA reporting lag.
Alongside industry work, I contribute to academic research at Stevens Institute of Technology on Robust PCA and Dynamic Factor Portfolios, and have built and validated ML models (LightGBM, neural networks, Ridge) for options pricing, auction price prediction, and credit risk across MBS and structured products.
Core expertise:
- AI Engineering: LLM pipelines, agentic systems, FastAPI, automated scraping & alerting, AWS EC2
- Quantitative Analysis: Statistical modeling (OLS/Ridge), time series, energy market forecasting, factor modeling
- Machine Learning: LightGBM, neural networks, ensemble methods, cross-validation frameworks
- Energy Markets: LNG, pipeline capacity, Waha/Henry Hub basis, EIA/Genscape/FERC data
- Quantitative Finance: Portfolio optimization, risk modeling (VaR/CVaR), robust PCA, credit risk
3+ Years Experience
10+ Projects Completed
Education
Master's in Financial Engineering & Analytics
Stevens Institute of Technology
2024 - 2025
Focus: Quantitative Finance, Algorithmic Trading, Risk Analytics, Portfolio Optimization
Bachelor of Business Administration
North South University (NSU)
2018 - 2022
Major: Finance | Minor: Mathematics
Key Courses: Calculus, Linear Algebra, Differential Equations, Corporate Finance, Investment Theory, Financial Derivatives, Applied Statistics
Latest News
New Role
Joined DXT Commodities as Agentic AI Engineer & Quantitative Analyst
Started full-time at DXT Commodities (Stamford, CT) in March 2026, working on LNG and natural gas market intelligence. Building Agentic RAG systems for pipeline notice parsing, LNG feed gas forecasting pipelines, and automated trading desk alerting infrastructure.
Course Completed
Advanced RAG (Retrieval-Augmented Generation) — May 2026
Completed a 10-module Advanced RAG course covering the full LLM pipeline stack — embeddings, vector stores, hybrid retrieval, HyDE, CRAG, Self-RAG, Graph RAG, Agentic RAG with LangGraph, and RAGAS evaluation. Directly applied to production AI systems at DXT Commodities.
Research Paper
Towards a Robust PCA and Dynamic Factor Portfolios Updating
Working on a research paper with Professor Papa Momar NDIAYE. We propose a dynamic tracking algorithm that modifies classical PCA to reduce instability of risk levels and principal factors — improving timing of portfolio rebalancing and reducing transaction costs.
The algorithm keeps the principal factors robust to perturbations in the observations and stabilizes them when the covariance matrix is updated. Spectral conditions detect changes in risk clusters that warrant a full reset of the tracking process.
Experience
DXT Commodities
Mar 2026 – Present · Full-time · Stamford, CT (Hybrid)
Agentic AI Engineer
- Designed and built an Agentic RAG system using LangChain and LangGraph to parse unstructured pipeline maintenance notices across 6 operator portal formats — architecting the full pipeline from document ingestion and chunking strategy through embedding model selection, vector store indexing, hybrid retrieval (dense + BM25), and LLM-based structured field extraction
- Engineered prompt templates and structured output schemas for capacity impact extraction — iterating on chain-of-thought prompting and output parsers to reliably extract operator, affected capacity (MMcf/d), duration, and constraint type from free-text notices with varying formats
- Implemented hybrid retrieval with contextual compression (dense embeddings + BM25 ensemble) and MMR re-ranking over a vector store of historical pipeline notices, improving extraction accuracy on ambiguous capacity constraint language by reducing irrelevant context passed to the LLM
- Built evaluation harness using RAGAS-style metrics (faithfulness, answer relevancy, context precision) to benchmark RAG pipeline quality across notice types and operator formats, enabling systematic prompt and retrieval tuning
- Deployed full-stack AI pipeline on AWS EC2 — FastAPI backend exposing RAG extraction endpoints, SQL Server persistence layer for parsed notices, and Power BI dashboard for trading desk consumption — covering 23 interstate pipeline operators
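The MMR re-ranking step mentioned in the bullets above can be sketched in plain Python. This is an illustration only, not the production pipeline: the function names (`cosine`, `mmr_rerank`), the `lambda_mult` default, and the idea of passing raw embedding vectors as lists are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult = 1.0 ranks purely by relevance to the query;
    lower values penalize documents similar to ones already selected.
    """
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            # Redundancy is the highest similarity to any already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected
```

With a low `lambda_mult`, near-duplicate notices are skipped in favor of more diverse context, which is the effect described above of reducing irrelevant or repeated context passed to the LLM.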
Quantitative Analyst – Market Fundamentals (LNG & Power)
- Designed and built a Permian Basin gas market intelligence system modeling daily production (~22–25 Bcf/d), pipeline egress capacity across 4 major interstate pipelines, and supply-demand balance to predict Waha basis pricing — directly supporting trading desk decisions
- Developed a real-time production estimation model using Genscape pipeline scrape data and EIA 914 reports, training a scaling model (OLS/Ridge regression) with R² and MAE validation to bridge the 2-month EIA reporting lag
- Built a US LNG feed gas forecasting pipeline covering 126 active export trains across 13 terminals (~22,600 MMcf/d nameplate), with 3-model validation framework achieving <3% MAPE on 8-month out-of-sample holdout
- Built an automated scraping and alerting system for EPNG critical notices — polling Kinder Morgan's EBB portal every 5 minutes, deduplicating notices, and pushing real-time Force Majeure and maintenance alerts to Microsoft Teams via webhook, enabling the trading desk to react within minutes of posting
- Engineered end-to-end quantitative pipelines in Python and SQL — from raw data ingestion (Genscape, EIA, FERC bulletin boards) through statistical modeling, cross-validation, and automated forecast output — replacing manual workflows across production estimation, capacity tracking, and pricing analysis
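As a rough sketch of the scaling-model idea above (regress reported production on a scraped pipeline sample, then validate on a time-ordered holdout), the following uses a one-feature ridge fit in plain Python. It is not the production model: the function names, the single-feature setup, and the `alpha` penalty value are assumptions for illustration, and the real system uses more inputs than this.

```python
def fit_ridge_1d(x, y, alpha=0.0):
    """Fit y ~ slope * x + intercept with an L2 penalty on the slope only.

    alpha = 0.0 reduces to ordinary least squares.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def holdout_mape(x, y, n_holdout, alpha=0.0):
    """Train on all but the last n_holdout points, score MAPE on the rest."""
    slope, intercept = fit_ridge_1d(x[:-n_holdout], y[:-n_holdout], alpha)
    preds = [slope * xi + intercept for xi in x[-n_holdout:]]
    return mape(y[-n_holdout:], preds)
```

Keeping the holdout at the end of the series, rather than sampling it at random, avoids leaking future observations into the fit.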
Stevens Institute of Technology
Jan 2024 – Present · Part-time · Hoboken, NJ (Hybrid)
Quantitative Portfolio Research Analyst
- Contributing to research on Robust PCA for dynamic factor portfolio updates — improving stability of risk factors and cutting rebalancing costs by 15%
- Implemented Python-based simulations using corporate bond spread data to study eigenvalue grouping, factor smoothness, and covariance structure robustness
- Developed algorithms for tracking dynamic factors and detecting data structure ruptures to enhance portfolio timing decisions
- Developed and validated ML models (Ridge regression, LightGBM, neural networks) for predicting short-term auction price movements, option pricing, and credit default probabilities
- Researched and implemented advanced risk models including VaR, CVaR, stress testing, and credit risk modeling for MBS, corporate loans, and structured products — covering PD, LGD, and EAD
- Built and analyzed models for mortgage and loan portfolio performance, incorporating prepayment risk, duration/convexity analysis, and sensitivity to macroeconomic variables
- Integrated risk parity, covariance shrinkage, and multi-asset volatility estimation techniques to enhance portfolio resilience and improve Sharpe ratios under varying market regimes
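For the VaR and CVaR measures listed above, a minimal historical-simulation sketch looks like the following. The function name, the quantile convention (VaR as the empirical loss quantile, CVaR as the mean of losses at or beyond it), and the small rounding guard are assumptions for illustration; other conventions exist and the research code may differ.

```python
import math


def historical_var_cvar(returns, confidence=0.95):
    """Historical VaR and CVaR, both reported as positive loss numbers.

    VaR  : the loss at the `confidence` empirical quantile.
    CVaR : the average of all losses at or beyond the VaR level.
    """
    losses = sorted(-r for r in returns)  # ascending, largest losses last
    n = len(losses)
    # Small epsilon guards against floating-point noise in confidence * n.
    cutoff = max(int(math.ceil(confidence * n - 1e-12)) - 1, 0)
    var = losses[cutoff]
    tail = losses[cutoff:]
    cvar = sum(tail) / len(tail)
    return var, cvar
```

By construction CVaR is never smaller than VaR, since it averages the tail that starts at the VaR observation.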
Projects
Bond Portfolio Optimization and Immunization
August 2025
Comprehensive bond portfolio management system combining quantitative finance with data engineering. Implements duration matching, convexity adjustments, and immunization strategies using real-time data pipelines, automated risk calculations, and scalable portfolio optimization algorithms for fixed income portfolios.
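A minimal sketch of the duration-matching idea, assuming annual-pay bonds and a flat yield. The function names and the two-bond setup are hypothetical simplifications and are not taken from the project code.

```python
def bond_metrics(face, coupon_rate, ytm, years):
    """Price, Macaulay duration, and modified duration of an annual-pay bond."""
    cash_flows = [face * coupon_rate] * years
    cash_flows[-1] += face  # principal repaid with the final coupon
    pv = [cf / (1 + ytm) ** t for t, cf in enumerate(cash_flows, start=1)]
    price = sum(pv)
    macaulay = sum(t * v for t, v in enumerate(pv, start=1)) / price
    modified = macaulay / (1 + ytm)
    return price, macaulay, modified


def immunizing_weights(d_short, d_long, d_liability):
    """Weights on a short and a long bond whose blended duration matches the liability."""
    w_short = (d_long - d_liability) / (d_long - d_short)
    return w_short, 1.0 - w_short
```

Matching duration alone only protects against small parallel yield shifts, which is why the project description also mentions convexity adjustments.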
Vasicek Bond Pricing Model - Monte Carlo, PDE & Analytical
July 2025
Comprehensive implementation of the Vasicek interest rate model featuring three pricing approaches: analytical solutions, Monte Carlo simulations, and PDE finite difference methods for zero-coupon bonds.
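The analytical leg of the three approaches can be sketched as below, using the standard closed-form zero-coupon price for the dynamics dr = a(theta - r)dt + sigma dW. The function and parameter names are assumptions for this example and may not match the project's own code.

```python
import math


def vasicek_zcb_price(r0, a, theta, sigma, tau):
    """Closed-form Vasicek price of a zero-coupon bond paying 1 after tau years.

    r0    : current short rate
    a     : mean-reversion speed (must be > 0)
    theta : long-run mean of the short rate
    sigma : short-rate volatility
    """
    b = (1.0 - math.exp(-a * tau)) / a
    ln_a = (theta - sigma ** 2 / (2.0 * a ** 2)) * (b - tau) - sigma ** 2 * b ** 2 / (4.0 * a)
    return math.exp(ln_a - b * r0)
```

A closed form like this is a convenient benchmark for checking that Monte Carlo and finite-difference prices converge to the same value.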
Portfolio Optimization
July 2025
Strategic asset allocation framework using modern portfolio theory, risk parity, and advanced optimization techniques with Riskfolio-Lib for multi-asset portfolio construction.
Vasicek Bond Pricing and Kalman Filtering
June 2025
Multi-method fixed income modeling combining Vasicek interest rate dynamics with Kalman filtering for parameter estimation and state variable tracking in bond pricing applications.
Data Science Projects
June 2025
Collection of data science applications in finance including statistical analysis, machine learning models, and data visualization for financial time series and market data.
Trading Strategy Based on MACD Signals
June 2025
Technical analysis-driven trading strategy using MACD (Moving Average Convergence Divergence) indicators for signal generation, backtesting, and performance evaluation.
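A minimal sketch of MACD signal generation in plain Python, assuming the conventional 12/26/9 spans and a simple crossover rule. The function names and the choice to seed each EMA with the first observation are assumptions for illustration, not the project's exact implementation.

```python
def ema(values, span):
    """Exponential moving average seeded with the first observation."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1.0 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return the MACD line, signal line, and histogram."""
    macd_line = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


def crossover_signals(macd_line, signal_line):
    """+1 when MACD crosses above the signal line, -1 when it crosses below, else 0."""
    signals = [0]
    for i in range(1, len(macd_line)):
        prev = macd_line[i - 1] - signal_line[i - 1]
        curr = macd_line[i] - signal_line[i]
        if prev <= 0 < curr:
            signals.append(1)
        elif prev >= 0 > curr:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

The resulting list of +1, -1, and 0 values can be fed into a backtest loop as entry and exit events.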
Cryptocurrency Forecasting Using ARIMA
June 2025
Time series forecasting application for cryptocurrency price prediction using ARIMA models, stationarity testing, and model selection for optimal forecasting accuracy.
Stock Price Prediction and Trading Strategy Using LSTM
March 2025
Deep learning approach to stock price prediction using LSTM neural networks, combined with algorithmic trading strategy development and performance backtesting.
Stock Brokerage System Low Level Design
February 2025
High-performance stock brokerage system architecture featuring order matching engine, portfolio management, and real-time market data processing.
Option Pricing Models
February 2025
Comprehensive options pricing library implementing Black-Scholes, binomial trees, and Monte Carlo methods for European and American options valuation with Greeks calculation.
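As an illustration of the Black-Scholes leg, the sketch below prices a European option on a non-dividend-paying asset and returns its delta. The function names and argument order are assumptions for this example and are not the library's actual interface.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, t, rate, sigma, kind="call"):
    """Return (price, delta) of a European option under Black-Scholes.

    t is time to expiry in years; rate and sigma are annualized.
    """
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    discount = math.exp(-rate * t)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2), norm_cdf(d1)
    if kind == "put":
        return strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1), norm_cdf(d1) - 1.0
    raise ValueError("kind must be 'call' or 'put'")
```

Put-call parity (call minus put equals spot minus discounted strike) gives a quick sanity check on any implementation of this kind.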
SPY Momentum Alpha Backtesting
February 2025
High-frequency momentum trading strategy combining data engineering and quantitative finance. Built robust data pipelines processing 2 years of SPY tick data from Polygon API, implemented real-time signal generation, and achieved 79% total return with comprehensive performance analytics and automated backtesting frameworks.
Pairs Trading Strategy
February 2025
Statistical arbitrage strategy using cointegration analysis and mean reversion. Employed Euclidean distance method for pair selection with z-score based entry/exit signals.
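The z-score entry and exit logic can be sketched as follows, assuming the spread between the two legs has already been computed. The function name, the rolling lookback of 20 observations, and the 2.0 and 0.5 thresholds are hypothetical values chosen for illustration.

```python
import statistics


def zscore_positions(spread, lookback=20, entry_z=2.0, exit_z=0.5):
    """Return a position per observation: -1 short the spread, +1 long, 0 flat.

    The z-score at time t uses only the previous `lookback` observations,
    so the current value never leaks into its own mean and deviation.
    """
    position = 0
    positions = []
    for t in range(len(spread)):
        if t < lookback:
            positions.append(0)
            continue
        window = spread[t - lookback:t]
        mu = statistics.fmean(window)
        sd = statistics.pstdev(window)
        z = (spread[t] - mu) / sd if sd > 0 else 0.0
        if position == 0:
            if z > entry_z:
                position = -1  # spread is rich: short it and wait for reversion
            elif z < -entry_z:
                position = 1   # spread is cheap: go long
        elif abs(z) < exit_z:
            position = 0       # spread has reverted: close the trade
        positions.append(position)
    return positions
```

Using separate entry and exit thresholds avoids flipping in and out of a trade when the z-score hovers around a single cutoff.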
Options Pricing Using Machine Learning
September 2024
Advanced machine learning approach to options pricing combining deep learning with financial engineering. Implemented neural networks, random forests, and ensemble methods with automated feature engineering, model validation pipelines, and real-time pricing systems that outperformed traditional Black-Scholes pricing in complex market conditions.
Market Analytics Web Application
August 2024
Full-stack data science application for comprehensive market analysis. Built interactive web application with real-time data ingestion pipelines, advanced data visualization dashboards, automated technical indicator calculations, and machine learning-powered trading signal generation with scalable cloud deployment.
Activities & Awards
Student Membership
CFA Society New York
Student member, actively engaged in professional events
Open Source Contribution
Riskfolio-Lib
Contributing to a leading Python library for portfolio optimization and risk management
Competition Participation
WorldQuant's 2023 International Quant Championship
Competed by designing and testing advanced trading strategies
Certifications & Licenses
Advanced RAG (Retrieval-Augmented Generation)
Self-paced Course
May 2026
10-module course covering the full RAG stack, from document processing, embeddings, and vector stores through advanced retrieval techniques (HyDE, CRAG, Self-RAG, Graph RAG), Agentic RAG with LangGraph, and production deployment. Directly applicable to LLM-powered pipelines at DXT Commodities.
Complete Algorithmic Trading Course with Python, ChatGPT, ML
Udemy
July 2025
Comprehensive algorithmic trading course covering Python programming, machine learning integration, and ChatGPT applications for automated trading strategies.
Akuna Capital Options 101
Akuna Capital
July 2025
Professional options trading course from leading market maker covering payoff diagrams, volatility, Greeks, and market-making fundamentals.
Complete Data Science, Machine Learning, DL NLP Bootcamp
Udemy
July 2025
Comprehensive bootcamp covering data science fundamentals, machine learning algorithms, deep learning, and natural language processing applications.
Taking Python to Production: Professional Onboarding Guide
Udemy
July 2025
Advanced Python course focusing on production deployment, best practices, and professional development workflows for enterprise applications.
The Ultimate JSON With Python Course + JSONSchema & JSONPath
Udemy
July 2025
Comprehensive JSON handling in Python including schema validation, path queries, and advanced data manipulation techniques.
Master Time Series Analysis and Forecasting with Python 2025
Udemy
June 2025
Advanced time series analysis covering ARIMA, SARIMA, Prophet, LSTM, and modern forecasting techniques for financial and business applications.
Manage Finance Data with Python & Pandas: Unique Masterclass
Udemy
July 2025
Specialized course on financial data management and analysis using Python and Pandas for quantitative finance applications.
Master Regression & Prediction with Pandas and Python [2025]
Udemy
July 2025
Advanced regression analysis and prediction modeling using Python and Pandas for financial and statistical applications.
Mathematics-Basics to Advanced for Data Science And GenAI
Udemy
July 2025
Comprehensive mathematics foundation covering linear algebra, calculus, probability, and statistics for data science and AI applications.
Python Object Oriented Programming (OOP): Beginner to Pro
Udemy
July 2025
Advanced Python OOP concepts including inheritance, polymorphism, design patterns, and enterprise-level programming practices.
The Complete SQL Bootcamp (30 Hours): Go from Zero to Hero
Udemy
July 2025
Comprehensive SQL training covering database design, complex queries, optimization, and real-world database management scenarios.
Fixed Income Analytics: Pricing and Risk Management
Udemy
July 2025
Specialized fixed income course covering bond pricing, yield curve analysis, duration, convexity, and interest rate risk management.
Learn Python Requests
Udemy
July 2025
Specialized Python course focusing on HTTP requests, API integration, and web scraping for financial data collection.
The Ultimate Pandas Bootcamp: Advanced Python Data Analysis
Udemy
July 2025
Advanced Pandas mastery for complex data manipulation, analysis, and visualization in financial and business contexts.
FastAPI - The Complete Course 2025 (Beginner + Advanced)
Udemy
July 2025
Modern Python web framework for building high-performance APIs, essential for financial data services and algorithmic trading platforms.
Mathematical Foundations of Machine Learning
Udemy
2025
Deep mathematical foundations covering linear algebra, partial derivatives, calculus, and probability theory for advanced machine learning applications.
Python Data Analysis: NumPy & Pandas Masterclass
Udemy
2025
Advanced data analysis techniques using NumPy and Pandas for quantitative finance and statistical computing applications.
GSX Verified Certificate for Probability - The Science of Uncertainty and Data
MIT / edX
December 2022
Rigorous probability theory course covering uncertainty quantification, statistical inference, and data analysis fundamentals from MIT.
Python and Statistics for Financial Analysis
Coursera
February 2022
Specialized course combining Python programming with statistical methods for financial data analysis and investment decision making.
Technical Skills
Programming Languages
Data Science & ML Libraries
Cloud & DevOps
AI Engineering & LLM
Development & Tools
Data Science & Analytics
Quantitative Finance
Energy Markets & Infrastructure
Get In Touch
Email: amir.khan@dxt.com
Phone: +1 (201) 234-7017
Location: Stamford, Connecticut, USA
LinkedIn: linkedin.com/in/amirkhan2317
Portfolio