A method and system are disclosed for providing a content-adaptive and legibility-retentive display of a lecture video on a miniature video device, where the lecture video comprises a sequence of textual and non-textual frames along with associated audio. The method comprises creating metadata that indicates the location of newly added data points in textual frames temporally spaced apart by a predefined time interval, by computing horizontal and vertical projection profiles of ink pixels in said textual frames and detecting the x-y positions of newly added data points from those profiles; and sequentially displaying key-frames extracted from the textual and non-textual frames in accordance with the metadata, by panning textual key-frames with a selection window whose aspect ratio and size match the display screen of the miniature video device and whose center point is the x-y position of the newly added data point in the respective textual frame.
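The profile-based localization and window placement can be illustrated with a minimal sketch. It is not the claimed implementation: it assumes frames are already binarized into 2D lists where 1 marks an ink pixel, it assumes new content is located at the midpoint of the rows and columns whose ink counts increased between two sampled frames, and it assumes the selection window is clamped to the frame boundary. The function names `projection_profiles`, `locate_new_content`, and `selection_window` are hypothetical.

```python
def projection_profiles(frame):
    """Return (horizontal, vertical) ink-pixel counts: one per row, one per column."""
    horizontal = [sum(row) for row in frame]
    vertical = [sum(col) for col in zip(*frame)]
    return horizontal, vertical


def _grown_span_midpoint(before, after):
    """Midpoint index of the span where the profile increased, or None if it did not."""
    grown = [i for i, (b, a) in enumerate(zip(before, after)) if a > b]
    if not grown:
        return None
    return (grown[0] + grown[-1]) // 2


def locate_new_content(prev_frame, curr_frame):
    """Estimate the (x, y) position of ink added between two sampled textual frames.

    Assumption: the position is taken as the midpoint of the columns (x) and
    rows (y) whose projection-profile values grew from prev_frame to curr_frame.
    """
    h_prev, v_prev = projection_profiles(prev_frame)
    h_curr, v_curr = projection_profiles(curr_frame)
    y = _grown_span_midpoint(h_prev, h_curr)
    x = _grown_span_midpoint(v_prev, v_curr)
    if x is None or y is None:
        return None  # no newly added ink detected
    return x, y


def selection_window(center, frame_size, window_size):
    """Return (left, top, width, height) of a window centered on `center`.

    Assumption: the window is shifted as needed so it stays inside the frame.
    """
    cx, cy = center
    frame_w, frame_h = frame_size
    win_w, win_h = window_size
    left = min(max(cx - win_w // 2, 0), max(frame_w - win_w, 0))
    top = min(max(cy - win_h // 2, 0), max(frame_h - win_h, 0))
    return left, top, win_w, win_h


# Toy 8x6 frames: the later frame adds three ink pixels on row 4, columns 5 to 7.
prev_frame = [[0] * 8 for _ in range(6)]
prev_frame[1][1] = prev_frame[1][2] = 1
curr_frame = [row[:] for row in prev_frame]
curr_frame[4][5] = curr_frame[4][6] = curr_frame[4][7] = 1

center = locate_new_content(prev_frame, curr_frame)
window = selection_window(center, frame_size=(8, 6), window_size=(4, 3))
```

In this sketch the per-interval `center` values would play the role of the metadata, and `window_size` would be chosen to match the aspect ratio of the miniature device's screen. Comparing whole-frame profiles is a simplification: it misses ink added on a row or column that lost an equal amount of ink elsewhere, so a real system might compare profiles of a difference image instead.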