Emotion Recognition in Conversations (ERC) requires modeling complex contextual dependencies across dialogue turns. While transformer-based models achieve strong performance on ERC benchmarks, several key design choices, including context construction, optimization strategy, and class-imbalance handling, remain insufficiently examined. In this work, we conduct a systematic empirical study of transformer-based ERC models across three benchmark datasets. We analyze the impact of context length and directionality, layer freezing, learning-rate scheduling, parameter-efficient fine-tuning, and class-imbalance mitigation strategies. Our results show that short-to-medium conversational context windows and moderate layer freezing yield stable, strong performance, whereas very long context windows, aggressive freezing, and parameter-efficient adaptation offer limited gains. Furthermore, imbalance-aware losses and data augmentation do not consistently outperform standard cross-entropy training. Overall, our findings provide practical guidance on effective and stable design choices for transformer-based conversational emotion recognition.