Let's put things in perspective... at 0:10 Link is BARELY as tall as the vines are long (maybe they're even longer!), and they're only a few feet from his position. In the bright SWAMP THING screenshot, Link is well into the foreground, so if we were to place him deeper into the background, closer to the vines, his image would definitely shrink and become comparable in size to the vines. So as I see it, when Link dives, his body makes a vertical plunge and he totally submerges himself, with his feet getting below the surface. But he soon slows down, as a simulated viscous Zelda liquid would cause, and changes his angle. You can see this at 0:14 by looking straight into his eyes: that line of sight traces back to the far edge of the water surface, where it touches the tree behind him. If he were still vertical and we were looking into his eyes like that, we should be able to get a look at the sky, but that's not the case. Even at 0:14, the vines still seem close by. Why? Cuz they're BIG.
In conclusion, Link barely reaches the top of the archway/cave opening, which isn't even halfway down to the "real" bottom (the SWAMP THING screen shows a nice flat floor, but the Watery Jungle-ish screen shows tons of crap on the bottom blocking much of the archway). Link will command the animals by throwing cats into the action as bait.