I think the comparison between Shenmue and Majora's Mask is a very valid way of looking at how the game would work. In my mind, Majora's Mask proved that the N64 can handle a game with a complex time system, where every character has their own schedule of things they do at different times and cause and effect can change the outcomes. The dialogue, as others have said, would likely have to be done in text rather than voice acted. If they went the completely impractical route and spread the game across however many cartridges they needed (not financially feasible, but this is theoretical, so bear with me), they could likely fit voice acting in alongside the music. If they did go the multi-cartridge route, they could use the N64 memory card (the Controller Pak) to transfer the save from one cartridge to the next, or just have the game save to the memory card in the first place. Now for a super impractical way of getting rumble AND using the memory card, since both plug into the same slot on the controller: they could set it up so that if the game detects a Rumble Pak in the 1st player controller, it looks for a memory card in the 2nd player controller. This way, people who want rumble could use it, while nobody would be required to have a 2nd controller just to save. Anyways, those are just some of my thoughts on how it might work.
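For what it's worth, here is a rough sketch of that Rumble Pak check in Python. It is not real N64 code: the names (`find_save_port`, `RUMBLE_PAK`, `MEMORY_CARD`) are made up for illustration, and the list of accessories just stands in for whatever the console would actually report for each controller port.

```python
from typing import List, Optional

# Hypothetical labels for what is plugged into a controller's slot.
RUMBLE_PAK = "rumble_pak"
MEMORY_CARD = "memory_card"


def find_save_port(accessories: List[Optional[str]]) -> Optional[int]:
    """Return the 0-based controller port to save to, or None if saving isn't possible.

    accessories[i] is the accessory in controller i (None means an empty slot
    or no controller). This only models the idea from the post, not real hardware.
    """
    first = accessories[0] if accessories else None

    # Normal case: memory card in the 1st player controller, no 2nd controller needed.
    if first == MEMORY_CARD:
        return 0

    # Rumble case: the 1st controller's slot is taken, so check the 2nd controller.
    if first == RUMBLE_PAK and len(accessories) > 1 and accessories[1] == MEMORY_CARD:
        return 1

    # No usable memory card found.
    return None
```

So `find_save_port([RUMBLE_PAK, MEMORY_CARD])` picks the 2nd controller, while `find_save_port([MEMORY_CARD])` sticks with the 1st, which is the "rumble is optional, a 2nd controller is optional" behavior I had in mind.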
I also like the idea somebody else mentioned of the dual-controller, GoldenEye-style setup. It would be odd, but as long as it worked it'd be sick!