Approximately 4.5 hours. This includes vocabulary (2 hours), sentence patterns (1 hour), and example conversations (1.5 hours). You should aim to listen to this entire library at least 10 times before your N5 exam.
Japanese is a pitch-accent language. In Tokyo-standard pronunciation, the word hashi can mean "chopsticks" (high-low) or "bridge" (low-high). Without the audio, a beginner has no way of distinguishing these, because the two words look identical in kana and romaji. The audio files provide a consistent model of Tokyo-standard pitch.