Background: Atrial fibrillation (AF) is a cardiac arrhythmia that increases the risk of stroke, so it is important to diagnose and treat it. Photoplethysmography (PPG) is an unobtrusive modality for heart rate monitoring and has been shown to be a promising solution for AF detection. Wrist-worn applications in particular can provide long-term monitoring, which can help detect intermittent and asymptomatic episodes of AF in daily living conditions. However, many existing studies have developed their AF detection models using relatively short measurements collected in controlled hospital environments. The objective of this study was to investigate whether the performance of an AF detection model trained and tested with short measurements generalizes to measurements in daily life.
Methods: PPG, accelerometer, and reference electrocardiography data were recorded from 32 subjects (13 with continuous AF, 19 without AF) during 24-hour monitoring. An AF detection model using inter-pulse interval features was trained to classify 30-second periods as AF or non-AF. The training data consisted of a selected 5-minute segment extracted from the daily recording of each subject during sleep. The model was tested both with 5-minute segments and with 24-hour data.
Results: The AF detection model developed on short recordings showed 98.6% accuracy (sensitivity: 97.4%, specificity: 99.4%, PPV: 99.1%) on 5-minute segments. On 24-hour data, representing daily life conditions, the same model showed similar sensitivity (97.6%) but lower specificity (97.4%) and PPV (95.4%). In the 24-hour recordings, every patient in the non-AF group had false-positive (FP) 30-second events (FPs: 38 ± 48, median 11 events/patient), compared with only 1 in the short recordings (FPs: 1, median 0 events/patient).
Conclusion: It is essential to test AF detection models intended for long-term PPG monitoring with data from daily life in order to obtain a realistic estimate of their detection accuracy.
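
The accuracy, sensitivity, specificity, and PPV reported in the Results follow from per-window confusion counts over the 30-second classification periods. The sketch below shows one way such metrics could be computed. The names `WindowCounts` and `detection_metrics` are hypothetical, and the counts in the example are purely illustrative; they are not taken from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class WindowCounts:
    """Confusion counts over classified 30-second windows (hypothetical container)."""
    tp: int  # AF windows classified as AF
    fp: int  # non-AF windows classified as AF
    tn: int  # non-AF windows classified as non-AF
    fn: int  # AF windows classified as non-AF


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def detection_metrics(c: WindowCounts) -> Dict[str, float]:
    """Compute the standard binary-classification metrics from window counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),   # true-positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),   # true-negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),           # positive predictive value
    }


# Illustrative counts only (assumed for this example, not study data).
example = WindowCounts(tp=90, fp=5, tn=95, fn=10)
metrics = detection_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV depends on the number of false positives relative to true positives, a modest drop in specificity over many non-AF windows can lower PPV noticeably, which is consistent with the pattern described for the 24-hour data.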